I'll explain why I call your article complete nonsense. It's not the study that's the problem. The data itself is likely fine. The problem is that it doesn't support the conclusion you're trying to draw. First we'll deal with Fig 1. Here is your conclusion:
"Based on the data found in figure 1, we reach the conclusion that weekly mileage does have an influence on performance in that increases in weekly mileage can result in improved performance. However, the influence of weekly mileage is not strong, predicting just 21% of finishing time. We can also conclude, based on the leveling of the relationship curve, that continually increasing weekly mileage does not cause continual improvements in race finishing time. There appears to be an upper limit to improvements due to increases in weekly mileage. Finally, this data shows that, on average, the relationship between increasing weekly mileage and race performance levels off at about 50 mpw. This leveling occurs at a significantly lower weekly mileage than that recommended by those who use the training mileage of elite athletes as their guide."
1. You fail to point out that, of all the variables studied, weekly mileage was the best predictor of finishing time.
2. The data only goes up to 100 km (62 miles) per week. The curve's slope is negative the entire way, which should lead you to conclude that more mileage does continue to improve performance. 62 mpw is also nowhere near the upper bound of the weekly mileage that athletes typically run. A very significant demographic has been inexplicably excluded from the study.
3. For some reason, you've used a scale that begins at zero even though the lower bound for the race is 49:19. Your inappropriate choice of scale grossly exaggerates the flatness of the curve. You suggest that it levels off, but it doesn't. You've only made it look that way by choosing a wide scale.
4. You make the mistake of confusing correlation and causation.
5. This is the main one. You've excluded what appears to be quite a bit of data. The lower and upper bounds for the race are 49:19 and 1:57:18, but the graph runs only from about 58:00 to 1:15. What about the rest? The total range of finishing times is over 1 hour, but the range you've used in the graph appears to be <20 min. Without seeing the data set, it's safe to say you've excluded a huge number of data points. Why?
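To put numbers on points 2 and 5, here is a quick sketch of the arithmetic. The race bounds are the ones quoted above; the ends of the plotted curve (58:00 and 1:15) are read off the graph by eye, so treat them as approximate. The helper name `to_minutes` is just something made up for this example.

```python
def to_minutes(clock: str) -> float:
    """Convert 'M:SS' or 'H:MM:SS' to minutes."""
    parts = [int(p) for p in clock.split(":")]
    if len(parts) == 2:
        minutes, seconds = parts
        return minutes + seconds / 60
    hours, minutes, seconds = parts
    return hours * 60 + minutes + seconds / 60

# Fastest and slowest finishing times for the race, as quoted above.
race_range = to_minutes("1:57:18") - to_minutes("49:19")

# Approximate ends of the plotted curve, estimated from the graph (assumption).
plot_range = to_minutes("1:15:00") - to_minutes("58:00")

# Highest weekly mileage in the data, converted from km to miles.
KM_PER_MILE = 1.609344
max_mpw = 100 / KM_PER_MILE

print(f"full range of finishing times: {race_range:.1f} min")
print(f"range shown on the graph:      {plot_range:.1f} min")
print(f"share of the range shown:      {plot_range / race_range:.0%}")
print(f"100 km per week = {max_mpw:.1f} miles per week")
```

That works out to roughly 68 minutes of finishing times against about 17 minutes on the graph, or about a quarter of the full range. It only compares the two spans; it can't say how many runners fall outside the plotted part without the data set.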
Fig 2 is even worse. Once again, your conclusions:
"This data quite compellingly shows that increasing mileage does not benefit all runners equally. By correlating weekly mileage and race performance for 3 distinct groups of runners rather than simply taking the average of all 4000+ runners it becomes clear that the relationship between weekly mileage and performance is not the same for all runners. The correlations lead us to conclude that some runners benefit more from increasing weekly mileage than do others. We also see that the benefit of increasing mileage levels off much sooner for some runners than others and at a much lower weekly mileage than is suggested by many as the optimal weekly run mileage. For some runners increasing mileage beyond a certain point not only doesn?t produce additional improvements but actually causes a decline in performance."
1. Once again, you've confused correlation and causation.
2. You've ignored huge amounts of data... again. The study includes athletes who run up to 100 km per week, yet for the purposes of this chart, you've truncated the data at 60 km per week. Why?
3. What does "% change in finishing time" mean? Are there runners in each of the three groups who train less than 10 km per week, which is essentially no running? It is virtually inconceivable that an individual who does not train could finish a 10-mile run in <1:06, and if such a person does exist, they would be an outlier and should not be used in your study. Therefore, I will conclude that when you say % change in finishing time, you mean the change from the average finishing time of non-runners. Interestingly, all the regression lines intersect the axis at this point. How is that possible? The slowest group would presumably intersect the axis near this point, since virtually all of the non-runners would be in this group. For the two faster groups, it would be impossible to intersect here. By virtue of being in a faster group, every runner is faster, and so every point would have a non-zero percent difference. You see where I'm going with this, don't you? The chart is simply impossible. It's a complete fabrication.
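Here is a small sketch of the reasoning in point 3. The finishing times below are invented purely for illustration, not taken from the study. It assumes the reading I described: percent change is measured against one common reference, the average time of runners who barely train.

```python
# Hypothetical average finishing times (minutes) at the lowest mileage
# bracket for the three groups. These numbers are made up for illustration.
low_mileage_avg = {"slow": 95.0, "middle": 80.0, "fast": 66.0}

# Assumed reading of the chart: one common reference for all groups,
# taken here as the slow group's low-mileage average.
reference = low_mileage_avg["slow"]

def pct_change(time_min: float, ref: float) -> float:
    """Percent difference of a finishing time from the reference time."""
    return 100.0 * (time_min - ref) / ref

# Where each group's line would start at the lowest mileage.
intercepts = {g: pct_change(t, reference) for g, t in low_mileage_avg.items()}

for group, value in intercepts.items():
    print(f"{group}: {value:+.1f}% at the lowest mileage")
```

With a common reference, only the slow group starts at zero. The two faster groups start below it, so under this reading their lines cannot all meet the axis at the same point.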
Bottom line: your first chart may be an accurate representation of part of the data set (although I doubt it), but you've presented it in a way that is extremely misleading, and it's only part of the set, likely the portion that came closest to supporting the conclusions you wanted to draw. The second chart is a flat-out lie. I believe you either hand-picked data points that would produce curves in the shape you wanted, or you simply drew it free-hand.