 May 2020

michaelbarrowman.co.uk michaelbarrowman.co.uk

underlying truth for all time points
Not clear where these come from: should it not be +/-0.2 from the above systematic under/over-prediction models? Maybe I am overlooking something.

two
three?

.
insert "...of calibration-in-the-large"

.
Will we put the simulation code somewhere (e.g. github)? I'd suggest we do (along with the simulation results).

the real-world
replace "practice"

avoid bias due to a small population size
you mean in terms of the monte carlo SE?
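For reference on the Monte Carlo SE point, a minimal sketch of the two standard formulas, one for an estimate averaged over simulation repetitions and one for an estimated coverage proportion. The function names are hypothetical and nothing here is taken from the paper's simulation code; it only illustrates that the Monte Carlo SE shrinks with the number of repetitions.

```python
import math


def mc_se_of_mean(estimates):
    """Monte Carlo SE of the mean of per-repetition estimates.

    `estimates` holds one estimate per simulation repetition (assumed layout).
    """
    n_sim = len(estimates)
    mean = sum(estimates) / n_sim
    # Sample variance across repetitions (n_sim - 1 denominator).
    var = sum((e - mean) ** 2 for e in estimates) / (n_sim - 1)
    return math.sqrt(var / n_sim)


def mc_se_of_coverage(coverage, n_sim):
    """Monte Carlo SE of an estimated coverage proportion (binomial SE)."""
    return math.sqrt(coverage * (1.0 - coverage) / n_sim)
```

With 1,000 repetitions and nominal 95% coverage, `mc_se_of_coverage(0.95, 1000)` is roughly 0.007, so observed coverage within about 1.4 percentage points of 95% would be consistent with Monte Carlo error alone.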

emulate a perfectly calibrated model
replace with "emulate a perfectly specified model"... I'm not sure calibration is the right term here: even a model that is perfectly specified (like this one) could still be miscalibrated if fitted on data of insufficient size? Of course, in this case n is large, so it will indeed be calibrated.

as
and?

formalise
replace "investigate"

study
simulation?

censoring needs to be handled in an appropriate way.
We should highlight somewhere that this is the focus of this paper; we begin to tease this out in the last sentence of this paragraph.

QRISK
We will need to introduce this; this is where my comment above re. introducing an example early would help.

. However
replace with "but" and combine the sentences

C
Insert "Nonetheless, censoring ..." to help link the next set of sentences with the rest of this paragraph.

].
Personally, I would suggest we limit the level of focus on QRISK, to avoid this turning into a 'let's-slam-QRISK' paper; the problem is wider than this, and we are simply using QRISK as a motivating example.

.
We should also briefly outline the other ways of assessing calibration for continuous or binary outcomes, i.e. calibration slope, flexible calibration plots.
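To make those measures concrete, a minimal sketch for a binary outcome with no censoring (an assumption made for illustration; handling censoring is the point of the paper and is not attempted here). Calibration-in-the-large is the intercept of a logistic model with the linear predictor as an offset, and the calibration slope is the coefficient on the linear predictor. All function names are hypothetical, and the fit uses plain Newton-Raphson rather than any particular library.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _logit(p):
    return math.log(p / (1.0 - p))


def calibration_in_the_large(y, p, iters=50):
    """Intercept a in logit(P(Y=1)) = a + logit(p), slope fixed at 1."""
    lp = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):
        mu = [_sigmoid(a + l) for l in lp]
        grad = sum(yi - mi for yi, mi in zip(y, mu))
        hess = sum(mi * (1.0 - mi) for mi in mu)
        a += grad / hess  # one-parameter Newton-Raphson step
    return a


def calibration_slope(y, p, iters=50):
    """(a, b) in logit(P(Y=1)) = a + b * logit(p).

    Needs at least two distinct predicted risks, otherwise b is not identified.
    """
    lp = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [_sigmoid(a + b * l) for l in lp]
        w = [mi * (1.0 - mi) for mi in mu]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * l for yi, mi, l in zip(y, mu, lp))
        h00 = sum(w)
        h01 = sum(wi * l for wi, l in zip(w, lp))
        h11 = sum(wi * l * l for wi, l in zip(w, lp))
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 Newton system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b
```

An intercept near 0 and a slope near 1 indicate good calibration on these measures; a flexible calibration plot would additionally show where along the risk range any miscalibration sits.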

here
delete

insert "the"

looked at
delete

it starts off at around 50% coverage reaches a peak of full coverage approximately 25% of the way through the timeframe
Very odd behaviour; would we expect this (genuine question)?

the c-statistic is also suitable
replace with "one can also apply IPCW to the c-statistic (a measure of discrimination)"
Also noting that we haven't yet introduced the c-statistic or the notion of discrimination: could perhaps do this in the first paragraph, when initially talking about validation. I appreciate discrimination is not the main focus of the paper, but it needs to be defined if mentioning it here.

added to
replace with "extended"

...across the full risk range

Clinical prediction models (CPMs)
We should say what they are first. E.g. "Clinical prediction models (CPMs) are statistical models/algorithms that aim to predict the presence (diagnostic) or future occurrence (prognostic) of an event of interest, conditional on a set of predictor variables. Before they can be implemented in practice, CPMs must be robustly validated."


michaelbarrowman.co.uk michaelbarrowman.co.uk

multiple outcomes
As above. Suggest replacing with "multidimensional" as you did in the abstract.

recommendations of Riley et al [22], whose calculations produce a requirement of 4.54 EPP
...but Riley et al.'s formulae move away from EPP; I don't follow this.

1.32, 0.01, 0.00; 0.00, 1.37, 0.01; 0.00, 0.00, 1.3
We need to get the CI around these (especially the diagonals) to see if 1 is included. Is there a way to do this? (I presume so.)

average scores remained strong
Can you pick out a few examples? E.g. "overall the three-state model was well calibrated in both internal validation and external validation, with calibration intercept (a.k.a. calibration-in-the-large) sufficiently close to zero, and a calibration slope sufficiently close to 1." etc.

All measures degraded over time
In terms of prediction horizon, you mean? In other words, the two year column in tab 6.9 is showing the performance of the models at predicting state probabilities two years after prediction time?

I’m still not sure the best way to do this (aesthetically).
Why can't you simply stack the rows, and indicate whether the cells (i.e. numbers) are means, ranges, missing counts, etc. or frequencies and percentages, accordingly in the footnote or next to the variable name? Alternatively, just add an empty row to the table to separate the continuous variables from the categorical ones, and footnote the table to explain.

.
See what Mark thinks, but it seems to me that this intro needs to focus heavily on the clinical angle. For example: "Many clinical prediction models (CPMs) have been developed to predict individual risk of different outcomes following chronic kidney disease, but few allow the ability to predict the risk of patients transitioning between likely clinical pathways post CKD onset and death. For example, the risk of having a transplant within 1 year following dialysis, or the risk of remaining on dialysis until death. Multistate models provide the vehicle to make such predictions, but have not been used within the CKD context."
....or something like this (I am sure you can do much better than my 5 min attempt!!).

Models performed well in model validation with the Three-State Model slightly outperforming the other two models in calibration and overall predictive ability
As per previous: readers currently don't know this because it wasn't talked about in the results section (edit: yes, in supplements, but it needs signposting at least).

Table 6.3 shows a breakdown of the categorical variables
It would be conventional to combine these tables, i.e. the classic Table 1. I presume you have done it like this for coding reasons, but for the final editing we should combine.

As part of this work, we also intend to produce an online calculator to allow patients and clinicians to easily estimate outcomes without worrying about the mathematics involved
I have recently been chatting with another PhD student in our centre (Videha Sharma) who is looking to use the GM Local Health and Care Record Exemplar to integrate the Tangri model into the clinical system and thereby provide the predicted risks automatically. I suggested that you might both wish to talk at some point to explore the scope of incorporating your model into Videha's work. Can connect you if needed.

Results
I appreciate the word limit, but I would suggest more specific results if we can, particularly for a clinical paper. For example, can we show the predictive performance metrics here? Perhaps take some of the words from the methods section, which is currently a significant part of the abstract.

there are far more males than females
For both this comparison and those above for Table 6.2, the discussion of "similarity" is based on eyeballing the absolute magnitude of the variables, rather than any formal test; is that right? If so, we should make it clear in the wording (e.g. crude proportions of males were numerically higher than females...).

without a loss to the quality of that prediction
I don't follow this?

 Apr 2020

michaelbarrowman.co.uk michaelbarrowman.co.uk

whether internal or external
I've said this before, but I think we need to be prepared for reviewers asking us to do this in our data to see how much "better" your model is compared with these existing ones (particularly the Grams model, which is most closely aligned with your model, i.e. both are MSMs of some form). I realise none of them model the full transitions like you do, but some of the existing models map to your transition-specific models, so could theoretically be compared. I don't think this needs addressing here, but it is worth considering a response for reviewers (and, more importantly, your viva!).


michaelbarrowman.co.uk michaelbarrowman.co.uk

bad results
??
