Below, I outline why the reviewers were mostly wrong (there is an effect, but it is tiny) and how we can control for the effect of age group on the regression. These answers were obtained via simulations of the problem in Matlab.
Do between-group differences in X and Y affect the correlation?
Because there is no true correlation between X and Y, the false-positive rate should be 5% if our significance threshold is 5%. Simulations showed a false-positive rate of 6.8%, which means that the mean differences in X and Y between groups produce more significant correlations than they should.
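The check described above can be sketched as follows. This is a minimal illustration in Python rather than the original Matlab code, and it is not expected to reproduce the 6.8% figure: the group size (50 per group), the effect sizes `d_x` and `d_y`, and the function names are all assumptions made for the example. The p-value uses the Fisher z approximation instead of the exact t test, so that only the standard library is needed.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value for r via the Fisher z approximation (assumption:
    used here in place of the exact t test)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_per_group, d_x, d_y, n_sims=2000, alpha=0.05, seed=0):
    """Fraction of simulations in which the pooled X-Y correlation is
    significant, although X and Y are independent within each group.

    Group 2 is shifted by d_x in X and d_y in Y (in SD units); these
    values are hypothetical, not taken from the original simulations.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        x += [rng.gauss(d_x, 1.0) for _ in range(n_per_group)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        y += [rng.gauss(d_y, 1.0) for _ in range(n_per_group)]
        if correlation_p_value(pearson_r(x, y), len(x)) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("no group difference:", false_positive_rate(50, 0.0, 0.0))
    print("shift in X and Y:   ", false_positive_rate(50, 0.5, 0.5))
```

With no group difference the rate should sit close to the nominal 5%. When both X and Y are shifted in the same group, the pooled data acquire a spurious correlation and the rate rises above 5%, by an amount that depends on the chosen effect sizes and sample size.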
Solution 1: Set the significance threshold to 0.0368 (0.05*(0.05/0.068)), which brings the false-positive rate back to 5%.
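The arithmetic behind this threshold can be written out as a small helper. The function name is hypothetical, and the proportional rescaling it performs is simply the rule used in Solution 1, not a general-purpose correction.

```python
def corrected_alpha(nominal_alpha, observed_fpr):
    """Rescale the threshold by the ratio of nominal to observed
    false-positive rate, as in Solution 1."""
    return nominal_alpha * (nominal_alpha / observed_fpr)


if __name__ == "__main__":
    # Nominal threshold 0.05, simulated false-positive rate 0.068.
    print(round(corrected_alpha(0.05, 0.068), 4))
```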
However, the size of this correction depends on the estimated effect sizes of the between-group mean differences in X and in Y, so it is specific to each correlation. Therefore, we looked for a more general solution:
Solution 2: Multiple regression.
Can multiple regression partial out the effect of age group?
We used the following model: Y = A + B*X + C*G + D*XG, where G is the categorical variable coding group membership and XG is the interaction term between X and G.
When using this model, we found that, in the absence of a relationship between Y and X, the coefficients B and D each had a false-positive rate of 5%, which matches our significance threshold.
The coefficient C was significant more frequently than the 5% threshold because of a difference in Y between groups (effect size: d=0.2). Using this regression model therefore allows us to partial out the effect of age group on the relationship between X and Y.
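A rough sketch of this regression check, again in Python rather than Matlab and using only the standard library, is given below. Several things are assumptions made for illustration: G is coded 0/1, each group has 50 observations, X differs between groups by `d_x` = 0.2 (only the d = 0.2 difference in Y comes from the text above), and the p-values use a normal approximation to the t distribution, which is slightly liberal at this sample size. The helper names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def invert(m):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_p_values(rows, y):
    """Ordinary least squares; returns a two-sided p-value per coefficient."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    sigma2 = sse / (n - k)  # residual variance
    p_values = []
    for i in range(k):
        t = beta[i] / math.sqrt(sigma2 * inv[i][i])
        # Assumption: normal approximation to the t distribution.
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(t))))
    return p_values


def rejection_rates(n_per_group=50, d_x=0.2, d_y=0.2, n_sims=1000,
                    alpha=0.05, seed=0):
    """Rejection rates for [A, B, C, D] in Y = A + B*X + C*G + D*X*G
    when Y is unrelated to X within each group."""
    rng = random.Random(seed)
    counts = [0, 0, 0, 0]
    for _ in range(n_sims):
        rows, y = [], []
        for g in (0, 1):
            for _ in range(n_per_group):
                x = rng.gauss(g * d_x, 1.0)
                rows.append([1.0, x, float(g), x * g])
                y.append(rng.gauss(g * d_y, 1.0))
        for i, p in enumerate(ols_p_values(rows, y)):
            if p < alpha:
                counts[i] += 1
    return [c / n_sims for c in counts]


if __name__ == "__main__":
    a, b, c, d = rejection_rates()
    print(f"B: {b:.3f}  C: {c:.3f}  D: {d:.3f}")
```

Under these assumptions, the rates for B and D should stay near the nominal 5%, while the rate for C grows with the size of the between-group difference in Y.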