Degrees of Freedom Tutorial

A lot of researchers seem to struggle with the statistical concept of degrees of freedom. Most do not really care why degrees of freedom are important to statistical tests; they just want to know how to calculate and report them. This page will help.

I couldn’t find any resource on the web that explains calculating degrees of freedom in a simple and clear manner, and I believe this page fills that void. It reflects my current understanding of degrees of freedom, based on what I have read in textbooks and scattered sources on the web. Feel free to add to it or comment.


Conceptual Understanding


Let’s start with a simple explanation of degrees of freedom. I will describe how to calculate degrees of freedom in an F-test (ANOVA) without much statistical terminology. When reporting an ANOVA, you write down degrees of freedom 1 (df1) and degrees of freedom 2 (df2) between the parentheses, like this: “F(df1, df2) = …”. Df1 and df2 refer to different things, but can be understood in the same way, as follows.

Imagine a set of three numbers; pick any numbers you want. For instance, the set [1, 6, 5]. Calculating the mean of those numbers is easy: (1 + 6 + 5) / 3 = 4.

Now, imagine a set of three numbers whose mean is 3. There are lots of sets of three numbers with a mean of 3, but for any such set the bottom line is this: you can freely pick the first two numbers, any numbers at all, but the third (last) number is out of your hands as soon as you have picked the first two. Say our first two numbers are the same as in the previous set, 1 and 6, giving us two freely picked numbers and one number that we still need to choose, x: [1, 6, x]. For this set to have a mean of 3, there is nothing left to choose about x. X has to be 2, because (1 + 6 + 2) / 3 = 3 and no other value works. So the first two values were free for you to choose; the last value is set accordingly to get to the given mean. This set is said to have two degrees of freedom, corresponding to the number of values that you were free to choose (that is, that were allowed to vary freely).

This generalizes to a set of any given length. If I ask you to generate a set of 4, 10, or 1,000 numbers that average to 3, you can freely choose all numbers but the last one. Those sets have 3, 9, and 999 degrees of freedom, respectively. The general rule for any single set is that if n equals the number of values in the set, the degrees of freedom equal n – 1.
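This constraint is easy to sketch in code (a toy illustration, not from the original text; the function name is mine): once all but one value are chosen, the target mean forces the last value.

```python
def forced_last_value(free_values, target_mean):
    """Given the n - 1 freely chosen values of an n-value set and a
    target mean, return the one value the last entry is forced to take."""
    n = len(free_values) + 1
    # The sum of all n values must be n * target_mean, so the last
    # value is whatever remains after the free choices.
    return n * target_mean - sum(free_values)

# The example from the text: [1, 6, x] with mean 3 forces x = 2.
x = forced_last_value([1, 6], target_mean=3)
# Two values were free to vary, one was forced: df = 3 - 1 = 2.
```

The same function works for sets of any length, which is exactly why the rule is n – 1 regardless of n.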

This is the basic way to calculate degrees of freedom: just n – 1. It is as simple as that. What makes it seem more difficult is that in an ANOVA you don’t have just one set of numbers; there is a system (design) to the numbers. In the simplest form you test the mean of one set of numbers against the mean of another set of numbers (one-way ANOVA). In more complicated one-way designs, you test the means of three groups against each other. In a 2 x 2 design things seem even more complicated, especially if there’s a within-subjects variable involved (note: all examples on this page are between-subjects, but the reasoning mostly generalizes to within-subjects designs). However, things are not as complicated as you might think. It is all pretty much the same reasoning: how many values are free to vary to get to a given number?

Df1


Df1 is all about means, not about single observations. Its value depends on the exact design of your test. Basically, the value represents the number of cell means that are free to vary to get to a given grand mean. The grand mean is just the mean across all groups and conditions of your entire sample. The cell means are nothing more than the means per group and condition. We’ll call the number of cells (or cell means) k.

Let’s start off with a one-way ANOVA. We have two groups that we want to compare, so we have two cells. If we know the mean of one of the cells and the grand mean, the other cell must have a specific value such that (cell mean 1 + cell mean 2) / 2 = grand mean (this example assumes equal cell sample sizes, but unequal cell sample sizes would not change the number of degrees of freedom). Conclusion: for a two-group design, df1 = 1.

Sticking with the one-way ANOVA, but moving on to three groups: we now have three cells, so we have three cell means and a grand mean. Again, how many means are free to vary to get to the given grand mean? That’s right: 2. So df1 = 2. See the pattern? For one-way ANOVAs, df1 = k – 1.

Moving on to an ANOVA with four groups. We know the answer if this is a one-way ANOVA (that is, a 4 x 1 design): df1 = k – 1 = 4 – 1 = 3. But what if this is a two-way ANOVA (a 2 x 2 design)? We still have four cell means, so to get to a given grand mean, three cell means can vary freely, right? Although this is true, we have more to deal with than just the grand mean, namely the marginal means. A marginal mean is the mean of all observations at one level of one variable, collapsed across the levels of the other variable. Let’s say our 2 x 2 ANOVA follows a 2 (gender: male vs. female) x 2 (eye color: blue vs. brown) design. In that case, the grand mean is the average of all observations in all four cells. The marginal means are the average across both eye colors for male participants, the average across both eye colors for female participants, the average across both genders for blue-eyed participants, and the average across both genders for brown-eyed participants. The following table shows the same thing:

           Brown eyes              Blue eyes
Male       CELL MEAN:              CELL MEAN:              MARGINAL MEAN of
           brown-eyed males        blue-eyed males         brown-eyed males and
                                                           blue-eyed males
Female     CELL MEAN:              CELL MEAN:              MARGINAL MEAN of
           brown-eyed females      blue-eyed females       brown-eyed females and
                                                           blue-eyed females
           MARGINAL MEAN of        MARGINAL MEAN of        GRAND MEAN
           brown-eyed males and    blue-eyed males and
           brown-eyed females      blue-eyed females

The reason we are now dealing with marginal means is that we are interested in interactions. In a 4 x 1 one-way ANOVA, no interactions can be calculated; in our 2 x 2 two-way ANOVA, they can. For instance, we might be interested in whether females perform better than males depending on their eye color. Because we are interested in cell mean differences in a specific way (i.e., we are not just interested in whether one cell mean deviates from the grand mean, but also in more complex patterns), we need to pay attention to the marginal means. As a consequence, we have less freedom to vary our cell means, because we need to account for the marginal means (if you want to know how this all works, you should read up on how the sums of squares are partitioned in 2 x 2 ANOVAs). It is also important to realize that if all marginal means are fixed, the grand mean is fixed too. In other words, we do not have to worry about the grand mean anymore when calculating df1 in a two-way ANOVA, because we are already worrying about the marginal means. As a consequence, df1 does not lose a degree of freedom for getting to a specific grand mean; it only loses degrees of freedom for getting to the specific marginal means.

Now, how many cell means are free to vary before the other cell means must be filled in to get to the four marginal means in the 2 x 2 design? Let’s start by freely picking the cell mean for brown-eyed males. We know the marginal mean for brown-eyed males and blue-eyed males together (it is given; all marginal means are), so we cannot choose the blue-eyed males cell mean freely. There goes one degree of freedom. We also know the marginal mean for brown-eyed males and brown-eyed females together, so we cannot choose the brown-eyed females cell mean freely either. And since we know the other two marginal means, we have no choice in what we put in the blue-eyed females cell to get to the correct marginal means. So, we chose one cell mean, and the other three had to be filled in as a consequence. You know what that means, don’t you? We have only one degree of freedom in df1 for a 2 x 2 design. That is different from the three degrees of freedom in a 4 x 1 design: the same number of groups, possibly even containing the same observations, but a different number of degrees of freedom. So now you see that from the degrees of freedom, you can infer a lot about the design of the test.
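This fill-in exercise can be mimicked in code (a toy sketch with made-up marginal means, assuming equal cell sample sizes; all names and numbers are mine, not from the tutorial): fix one cell, and the marginal means force the other three.

```python
def fill_cells_2x2(top_left, row_means, col_means):
    """Given one freely chosen cell mean of a 2 x 2 design and the
    (equal-n) marginal means, return the full table of cell means.
    Each row/column marginal is the average of its two cells."""
    top_right = 2 * row_means[0] - top_left       # row 1 marginal forces it
    bottom_left = 2 * col_means[0] - top_left     # column 1 marginal forces it
    bottom_right = 2 * row_means[1] - bottom_left # row 2 marginal forces it
    return [[top_left, top_right], [bottom_left, bottom_right]]

# Hypothetical marginal means: rows average 3 and 5, columns 2 and 6
# (both pairs share the same grand mean, 4, so they are consistent).
cells = fill_cells_2x2(1, row_means=(3, 5), col_means=(2, 6))
# Only the one cell we picked was free: df1 = 1 for a 2 x 2 design.
```

Note that the remaining column marginal comes out right automatically, which is the sense in which the grand mean "takes care of itself" once the marginal means are fixed.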

You could do the same mental exercise for a 2 x 3 design, but it is tedious to write up, so I will give you the general rule. Every variable in your design has a certain number of levels: variable 1 in the 2 x 3 design has 2 levels, variable 2 has 3 levels. You get df1 by multiplying the levels of all variables, but subtracting one level from each variable first. So in the 2 x 3 design, df1 = (2 – 1) x (3 – 1) = 2 degrees of freedom. Back in the 2 x 2 design, df1 = (2 – 1) x (2 – 1) = 1 degree of freedom. Now let’s see what happens with a 2 x 2 x 2 design: (2 – 1) x (2 – 1) x (2 – 1) = still 1 degree of freedom. A 3 x 3 x 4 design (I hope you never have to analyze that one): (3 – 1) x (3 – 1) x (4 – 1) = 2 x 2 x 3 = 12 degrees of freedom.
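As a sketch, the general rule is a one-liner (the function name is mine; this computes the df1 for the effect involving all the listed variables):

```python
from math import prod

def df1(levels):
    """df1 for the effect of a fully crossed design involving the given
    factors: multiply (number of levels - 1) across all factors."""
    return prod(k - 1 for k in levels)

# The examples from the text:
df1([4])        # 4 x 1 one-way: 3
df1([2, 3])     # 2 x 3: 2
df1([2, 2])     # 2 x 2: 1
df1([2, 2, 2])  # 2 x 2 x 2: 1
df1([3, 3, 4])  # 3 x 3 x 4: 12
```

With a single factor, this reduces to the one-way rule df1 = k – 1 from above.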

By now, you should be able to calculate df1 in F(df1, df2) with ease. By the way, most statistical programs give you this value for free. Still, you can now judge, to some extent, whether researchers performed the right analyses in their papers based on their df1 value. Df1 is calculated the same way in a within-subjects design: just treat the within-subjects variable like any other variable. Let’s move on to df2.

Df2


Whereas df1 was all about how the cell means relate to the grand mean or marginal means, df2 is about how the single observations in the cells relate to the cell means. Basically, df2 is the total number of observations in all cells (n) minus the degrees of freedom lost because the cell means are set (that is, minus the number of cell means or groups/conditions, k). Df2 = n – k, that’s all folks! Say we have 150 participants across four conditions. Then df2 = 150 – 4 = 146, regardless of whether the design is 2 x 2 or 4 x 1.
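In code, df2 for a between-subjects design is a single subtraction (a trivial sketch; the names are mine):

```python
def df2(n_observations, n_cells):
    """Error degrees of freedom for a between-subjects ANOVA:
    total observations minus the number of cells (n - k)."""
    return n_observations - n_cells

# 150 participants across four conditions, whether 2 x 2 or 4 x 1:
error_df = df2(150, 4)
```

The point of the example is that df2 depends only on n and k, not on how the k cells are arranged into factors.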

Most statistical packages give you df2 too. In SPSS, it’s called df error, in other packages it might be called df residuals.

For within-subjects designs, things can become a bit more complicated. (The following paragraphs are work in progress.) Df2 for a repeated measures ANOVA with one within-subjects factor is calculated as follows: df2 = df_total – df_subjects – df_factor, where df_total = the number of observations across all levels of the within-subjects factor (n) – 1, df_subjects = the number of participants (N) – 1, and df_factor = the number of levels (k) – 1. With complete data, n = N x k, so this simplifies to df2 = (N – 1) x (k – 1). Basically, the take-home message for repeated measures ANOVA is that you lose one additional degree of freedom for the subjects. (If you’re interested: this is because the sum of squares representing individual subjects’ average deviation from the grand mean is partitioned separately, whereas in between-subjects designs it is not. To get to a specific subjects sum of squares, N – 1 subject means are free to vary, hence you lose one additional degree of freedom.)
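Assuming complete data (every participant measured at every level of the within-subjects factor), the arithmetic looks like this (a sketch; the function name is mine):

```python
def df2_within(n_subjects, n_levels):
    """Error df for a repeated measures ANOVA with one within-subjects
    factor, assuming complete data (n = N * k observations)."""
    df_total = n_subjects * n_levels - 1     # n - 1
    df_subjects = n_subjects - 1             # N - 1
    df_factor = n_levels - 1                 # k - 1
    return df_total - df_subjects - df_factor

# N = 68 participants, k = 4 levels:
error_df = df2_within(68, 4)
```

If your software reports a smaller value than this formula gives, a likely culprit is missing data, which reduces n below N x k.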

Conclusion


You should be able to calculate df1 and df2 with ease now (or identify them from the output of your statistical package, such as SPSS). Keep in mind that the degrees of freedom you report are those of the design of the effect that you are describing. There is no single set of degrees of freedom that is appropriate for every effect of your design (although in some cases they may happen to have the same value for every effect).

Moreover, although we have been discussing means in this tutorial, for a complete understanding you should learn about sums of squares, how those translate into variance, and how test statistics, such as the F-ratio, work. This will make clear how degrees of freedom are used in statistical analysis. The short functional description is that degrees of freedom primarily determine which critical values are chosen for the test statistics of interest, given a specific alpha level (remember those look-up tables from your early statistics classes?).

If you have any questions, remarks, or corrections, please leave them in the comments below.



90 Comments

  1. Great explanation! Thank you so much for taking the time to make this page.


  2. Barbara Khavugwi October 27, 2011 at 11:49 AM

    Great explanation, I have gained so much insight


  3. Cheers for that! It certainly aided my understanding.


  4. This is really helpful! Thank you :)


  5. its awesome, great explanation you got!, kudos


  6. Thank you. Your response was succinct and useful. I hope that you address the ANOVA tables in future. Some tables yield F, Sig. Partial Eta Squared, Noncent. Parametr, and Observed Power. Others that are somewhat easier to read yield only Type III SS, Mean Square, F value, and Pr > F. I would like see them debunked as ably as you addressed the question of why there are two numbers for degrees of freedom


    1. Thank you for the positive feedback. I’ve put this on my todo list, but might not get to it soon.


  7. Thanks, Ron, great page, helped a lot already.
    Would you mind going one step further on the repeated measures design and explain how df2 is derived for two within-subjects factors?


    1. Thanks! I want to work on expanding this tutorial to the within-subjects case this summer, as soon as I have time.


  8. Thank you very much! this essentially cleared up the confusion I’ve had for years about degrees of freedom.

    By the way, one point. you mentioned
    (cell mean 1 + cell mean 2) / 2 = grand mean. Shouldn’t the grand mean be (n_{cell1}*cell mean 1 + n_{cell2}* cell mean 2)/ (n_{cell1}+n_{cell2} )? I am a bit confused on this.


    1. You’re correct, the example I gave assumes equal cell sample sizes. I’ll mention that in the text. Thanks!


  9. Thank you very much for your explanation, it was really clear. I am, however, still a bit confused, as my calculation does not coincide with the SPSS output. I want to calculate df2 for a repeated measures ANOVA with one within-subjects factor (4 levels), N = 68. Following your formula, this should be df-total (= 271) – df_subjects (=67) – df_factor (= 3) = 201. SPSS nevertheless tells me that df2 = 192. Do you happen to know what is going on – am I just making some fundamental mistake, or does SPSS use a different formula that yields a different result?


    1. Hi Laura. I have to admit, I am still struggling a bit with the repeated measures degrees of freedom (as I wrote above, that paragraph is work in progress). This seems like a good opportunity to figure this out once and for all. Could you e-mail me your SPSS output so I can take a look?


  10. Ankit Aggarwal July 1, 2012 at 12:11 PM

    Good article .Though I have a doubt about your example regarding df1 .

    In your example of 2 X 2 matrix ,if I just fix then one cell mean ,then also I can’t identify the last cell’s mean given the constraint of 2 specific marginal means to satisfy .Hence df1 should be zero by your logic .Please explain .

    Thanks a lot in advance .


    1. Dear Ankit, I do not completely follow you. If I have four marginal means in the 2 x 2 matrix as constraint, there are many different combinations I can choose. Then, for one cell, say in the top-left, I choose a number (giving me one df), the numbers of all the other cells immediately become fixed, in order to satisfy the marginal means constraint. This leaves me one df.


  11. Seriously, this is the best explanation of degrees of freedom I have ever read. Thank you so much for taking the time to share it.


  12. realy the page solve for me a long year problem respect to degree of freedom !thank you very much more than i can say next to God


  13. Greatest explantion to DF ever……….gratitudes…


  14. I am new to research field so don’t know whatever is written is wrong or right but is explained pretty well….. and my understanding about degree of freedom enhanced very much with this…. thanks a lot


  15. Thanks for writing this, like the other commenters, this was the most clear explanation of df I’ve come across yet. I have a presentation on Two-way ANOVA to do and this was a GREAT help!



  17. Thanks so much!!!


  18. Great explanation
    Simple and clear as it should be!


  19. simple way to explain about degree of freedom thanx


  20. great explanation. thanks alot


  21. dear RON, can i have your email id, so that i can communicate to you should i get a query any time
    My Email Id is : drniteenkumar@gmail.com
    thanks


  22. Thanks a lot for the clear cut explanation.


  23. Hi Ron,
    Really clear explanations. I appreciate it! But there is this question that bugs me, and I don’t think it was addressed in this tutorial. Why Degrees of Freedom? Why is the degree of freedom used in the ANOVA calculations? In the computation of mean squares, the total sum of squares is divided by the degrees of freedom rather than the total number “n”. Why so?


    1. Hi Shiva,
      Basically, this is because it has been shown mathematically that dividing by n gives you what is called a biased estimation (it structurally over-estimates the value) while dividing by df gives you an unbiased estimation (it might still be the wrong value, but at least whether it over- or under-estimates that value is random). I have never really tried to understand the actual mathematical proof for this, but I am comfortable enough with knowing that very smart people have shown this to be the case.


      1. Oh ok I see …
        Thanks!


  24. Thank you for this explanation. Here’s somethinge else for your to-do-list. Can you please explain how to calculate the degrees of fredom for a repeated measures ANOVA after a Greenhouse-Geisser correction?


    1. Hi Kristoffer, that’s a very specific question. Have you tried reading:

      Greenhouse, S.W., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24, 95-112

      By the way, the short answer is: multiply df1 and df2 with the Greenhouse-Geisser estimate.


  25. This is a very helpful post on degree of freedom….
    But sir how i detemine the degree of freedom for a set of data containing above 4oo data…?


    1. The number of data points does not change the way you calculate degrees of freedom. It depends completely on the design of the test you’re performing.


  26. Thanks Dr. Ron. If someone is looking for speedy but elegant explanation of this forever difficult concept among starters, this is the webpage to be.


  27. Too good. Author should write more and more such tutorials and may be think of having a Google hangout sessions! Awesome! Thanks a lot.


  28. Plz tell me about values in any set that average to example 3 or 4 ….


    1. Hi, can you please be more specific?


  29. Hi,
    Thank you for your explanation, it helps me a lot.
    I have one question: in multiple linear regression, when select between 2 possible models, model with less predictor variables is also prefer since we will save 1df. I do not understand what is the point of saving 1 df, and when it is worth to lose 1 df for model with more variables.
    Thank you and I wish you all the best.


  30. Thank you, I found this a very useful explanation.


  31. Dear Ron,

    that was a very thorough explanation. Great job. You can see that this is a never-ending issue. I have a comment. You explain how to calculate the degrees of freedom once the experimental setup is done, and you clearly state that stats software does it for you. But another relevant question that software does not say is how many degrees of freedom are needed for a given experiment to obtain reasonably good estimates of variability. For example, in a 2×2 ANOVA design, is it the same if we measure 5 replicates or if we measure 10 replicates per marginal cell? In both cases we would have sufficient data for analysis, but which one is better? Put it another way, how many reps are needed? That is a tough question frequently.

    Cheers


    1. Hi David,

      That’s a good question. The answer is, more observations is better, but too many observations will get any statistical null-hypothesis test significant, regardless of how big your effect is. So the solution here is to do power-analysis in order to know how many observations you would need. A great program to do this, is G*Power. Besides that, in a recent talk at the SPSP conference by Joseph Simmons, a neat heuristic was proposed: at the minimum, use 50 observations per cell (this is the number of observations you would minimally need to observe the effect that men weigh more than women most of the time; if you think the effect you’re studying is smaller than that, you will need even more observations). I hope this helps a bit.

      Best,

      Ron


  32. Very useful resource, simple to understand thanks


  33. Very helpful and clear, thank you.


  34. Othman Belgassim March 8, 2013 at 8:51 PM

    Thank you, I found this very helpful and simple to follow


  35. This is brilliant! Best explanation I have ever read. Thank you very much!


  36. i understood in 15 minutes what my lecturer has been trying to explain in two lessons,, THANK YOU..


  37. This was so helpful! Thank you SOOOO much :)


  38. Thank you so much for writing this! It was very helpful!!!! :)


  39. Thanks. Really appreciated your work. When studying online one looks broader afield than lecture notes. This helped greatly. Narrative is SO important in understanding concepts. Often stats lecturers dont get that. Particularly because they KNOW the stuff so well but forget that we the students DO NOT! Cheers from Australia!


  40. Wolfgang STein May 7, 2013 at 10:33 PM

    Excellent explanation. Thanks so much!


  41. ankur srivastava May 9, 2013 at 12:07 PM

    simple and hits to the concept. very helpful for understanding in quick time :)


  42. This was so good and helpful! Have a stats test and really needed to understand this! Thank you for your effort in putting this together!


  43. Helped a lot. Thanks!


  44. Hi,

    Thank you for the explanation. I am still a little confused however. When doing a RM-2-way ANOVA and I want to report the df1 and df2, is the df2 always the same? e.g. Say I have 2 groups of people (male vs female) and those same people are tested at daily (ie. 2 independant variables, gender and day). If I want to report the f value for gender and the f value for day I understand that the df1 will be different because there are only 2 groups for gender but multple groups for day (7 days). What about the df2? I am still a bit confused about that and different scientific papers seem to report different things so I’m not sure what to rely on.

    Thank you


    1. Agh! Ron, you answered the other ones, why not this one! Same question from me!


  45. You’ve made this so much easier to get my head around that i feel compelled to humbly give my thanks


  46. thank you so much, it was excellent.


  47. Please advise about the benefits to the degree of freedom in research?
    thank you


  48. thanx a lot for such a clear and exact explanation of DOF.


  49. Great explanation. But what about degrees of freedom in MANOVA?


  50. This is an amazing explanation. No one I’ve ever met, no matter how familiar with statistics, has ever been able to explain this idea as clearly as you have done here.


  51. Very helpful intuitive explanation of degrees of freedom but for a more rigorous explanation as to why degrees of freedom constitute the dimension of a space that contains a statistical vector see Wickens, T.D. (1995) The Geometry of Multivariate Statistics, Lawrence Erlbaum Associates, New Jersey, reprinted 2009 by Psychology Press, New York and Hove, East Sussex. Wickens’s book gives the clearest explanation I have EVER seen. This book explains means and standard deviations, correlation coefficients, R-squareds, bivariate and multiple regression, ANOVA, spherical errors and much more, all in the context of simple geometry. I can’t recommend this book enough!


    1. Thanks, sounds like a strong recommendation. I will check it out!


  52. “Df2 = n – k, that’s all folks! Say we have 150 participants across four conditions. That means we will have df2 = 150 – 4 = 146, regardless of whether the design is 2 x 2, or 4 x 1.”

    For me it is easier to think in the framework of linear regression when thinking about degrees of freedom. For df within it follows that df = n – number of regression weights. In a 4 * 1 Design, the number of regression weights is 4 (3 dummy variables + intercept), thus for your example 150 – 4 = 146. In a 2 * 2 Design, the number of regression weights is 3 (1 dummy variable to code each condition + intercept), thus for your example 150 – 3 = 147. Am I missing something?


    1. In a 2 * 2 design, the number of regression weights is also 4 (2 dummies for the two main effects, 1 dummy for the interaction effect, and the intercept), so you end up with 150 – 4 = 146.


  53. anyone can understand things but it takes a genius to simplify things. you are a genius!
    thanks a lot for the wonderful explanation sir.


  54. good explanation . but i have question that why we use degree of freedom on econometric model?


    1. Thank you very much I am worry of this term all ways. but you put with good explanation, thank you



  56. Dear Sir,
    the way of explaning is much easier ever i read before and thanks truely to make better understanding about the topic,,,,so please guide us as much as you can i have got other problem of statistic like many test so plz tell me how can i communicate to you…plz do tell me
    thanks once again

    Ravi singh rawat


  57. Great explanation! Really hit the concept home. Thank you for taking the time to put this together.


  58. Great explanation! Thank you so much.
    I wish you could insert an example for calculating df2 for a within-subject design.


  59. I love the degrees of freedom explanation. On the two way ANOVA and two way repeated measures ANOVA, I think readers might get more use out of an explanation for the degrees of freedom 1 and degrees of freedom 2 calculations for each main effect as well as for the interaction term. As Emma stated, we see these values bouncing around in the scientific literature all the time.


  60. Thank you!! This just saved me for the final I have tomorrow.


  61. This finally clarified that pesky “df2″ for my manuscript. Thank you.


  62. good work. it helped alot………..:)


  63. Thanks so much for such a clear explanation. I would therefore ask you to clear up two additional points in the same clear and lucid manner:
    1. In general how are df’s calculated for various types of regressions – simultanious multiple regression, heirerichial regression (for each step), etc.

    2. In general, why are df’s important – again, a simple general explanation.

    Thanks!


  64. Theophil Mlungwa March 7, 2014 at 4:11 PM

    It really clear


  65. This explanation did not help me one bit. I am taking an undergrad level of nursing research. I am still just as confused as before.


  66. great explanation. thanks a lot.


  67. Thank you so much for this post. This is the first time I actually understood the degrees of freedom and I was even able to tell a statistician that he made a mistake :).


  68. Best explanation I’ve read so far, but I’m still a little confused as to how to calculate df2 for a rANOVA where I have 3 species and 3 treatments. I can’t get the SPSS output for df2 (error for within-subjects effects) to reconcile with the calculation. I have a sample size of 10 for two species and nine for the third. So I calculate N to be 87, n to be 29, but I don’t know what’s the df for factor. The output df is 52, but I can’t get to it using the formula above. Any ideas? Thanks!


  69. I cannot tell you how long I have been looking for a great explanation of this. Thank you for posting this!


  70. Thank you for such great explanation. I was looking all over for the difference between df1 and df2. Though im still a bit uncertain and would like to ask a quick question for my manuscript. Im working on tests of independence of row and column in SPSS complex sample module. SPSS output reports Chi square, adjusted F, df1, df2 and sig. The journal requires me to report df for chi square. Not sure which one should I report, df1 or df2 or both? Thank you so much for your time. your help is greatly appreciated. Thank you.


    1. The dfs for chi square are not completely the same as those for ANOVA frameworks. Df2 does not exist, but as a second parameter the number of observations is usually reported. For more information, this seems to be a helpful page: http://davidmlane.com/hyperstat/B158107.html


      1. Thank you very much. I think the dfs in the SPSS output are for F statistic. i’ll just calculate them by hand. Thank you again!

        An example of SPSS output for test of independence

        http://growingupinscotland.org.uk/wp-content/uploads/2013/05/HandoutComplexSamples2.pdf

        there are df1 and df2

        [Edited by RD: removed previous post and one line in this post regarding spamming my page, which is not a problem. ;)]


  71. Hello Ron,

    Thank you for the wonderful explanation!

    I have recently come across a paper which left me quite perplexed on the subject of df in repeated measures and mixed designs. In trying to figure it out, I came across this post – which I will definitely return to, as it is a very helpful guide. :)

    Still, I can’t quite understand how the df were calculated in this paper. I’m not sure if this is the right place to ask, but If you have a few moments, I’d greatly appreciate it if you could give your opinion on this:

    The paper describes a series of studies using a visual search task, in which reaction time is the dependent variable. The data was analyzed using Mixed Effects Anova, on “trial-level data” (which I don’t think I have seen in this context before, nor am I clear on why trial-level data analysis was used here). RTs were not transformed.

    I tried re-creating the calculation of df in one of their studies (experiment 1, which includes only one within-subject variable), but fell short. The reported df are higher than the number of trials multiplied by the number of participants. Could this be possible?

    Furthermore, what I found so strange in this paper is that the reported df were insanely high – yielding very large effects, while the RT differences between conditions were very small at times.

    For example, the second experiment included 96 participants. A 2×2 Mixed Effects Anova revealed a significant effect of the within-subject variable : F(1, 4525) = 11.5, p = .001. But the difference between conditions was merely 15 ms: mean 1 = 645 ms; mean 2 = 660 ms. Unfortunately, Sd’s were not reported.

    My first thought was that the use of trial based analysis inflated the degrees of freedom – which, if I am not mistaken, makes it easier to detect effects. Could this be the case?

    Any input into this matter would be greatly appreciated!

    Thank you,
    Kitt

