At the end of the study neither group was able to identify accurately which treatment period was thyroxine or placebo (table 4).
Fascinating. At 100 mcg, I'd expect one to be able to tell the difference. This is especially surprising given that there were two groups: the symptomatic group and the healthy control group. I'd expect at least one group to be able to tell the difference. However, it's worth noting that TSH levels in the two groups were virtually identical. The groups were selected based on hypothyroid symptoms rather than actual thyroid status.
The fact that healthy controls could not tell the difference is odd. It is not what I expected, and it cuts against what the related lesswrong article says about discernibility to healthy subjects (odd given that lesswrong mentions this specific study). What may be happening is that the question was not specific enough. If subjects interpreted reduced vitality as a sign that they were not receiving thyroxine, then many might get it wrong. I wish they had a more thorough questionnaire on perception of drug effects. Even an informal "describe the experience" question would be nice.
All in all, I find it unlikely that this is accurate. There are at least two possibilities. It could indicate that the dose was not high enough, in which case the study is not testing what it thinks it's testing. That is supported by the fact that the TSH of the thyroxine group is barely out of the normal range. The other option, as mentioned above, is that they are not asking the right questions. If that is the case, that also seems to invalidate the findings.
In conclusion, this result means that we can't trust the study. I'm certain subjects could distinguish higher doses from placebo. It is likely that the effect would have been discernible at this dose (100 mcg) had the right questions been asked. I'm thankful they included this outcome.
P.S. I found this study completely by accident, then found out it was related to that lesswrong article that I've always appreciated.