📦 mcguinlu / thesis

📄 03-Chp-Systematic-Review-Methods.Rmd · 388 lines
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388---
bibliography: bibliography/references.bib
csl: bibliography/nature.csl
output:
  bookdown::pdf_document2:
    template: templates/brief_template.tex
  bookdown::word_document2: 
      toc: false
      toc_depth: 3
      reference_docx: templates/word-styles-reference-01.docx
      number_sections: false 
  bookdown::html_document2: default
documentclass: book
---
```{block type='savequote', include=knitr::is_latex_output(),quote_author='(ref:sys-rev-methods-quote)', echo = TRUE}
It is surely a great criticism of our profession that we have not organised a critical summary by speciality or sub-speciality, up-dated periodically, of all relevant RCTS.  
```

(ref:sys-rev-methods-quote) --- Archibald Cochrane, 2000[@cochrane1979]

# Systematic review [and meta-analyses]{.correction} of existing evidence on the association between blood lipids and dementia outcomes: Methods {#sys-rev-methods-heading}

&nbsp;

\minitoc <!-- this will include a mini table of contents-->

```{r, echo = FALSE, warning=FALSE, message=FALSE}
source("R/helper.R")
knitr::read_chunk("R/03-Code-Systematic-Review.R")
doc_type <- knitr::opts_knit$get('rmarkdown.pandoc.to') # Info on knitting format
```

```{r prisma-flow-setup, include = FALSE}
```

<!-- IDEA Stratify by _Apo_$\mathcal{E}4$ genotype -->  

<!-- TODO Gib Hermani question on bias in selection of AD patients in Ostergaard -->

::: {.laybox data-latex=""}
## Lay summary {-}

Systematic reviews are a type of research study that aim to collect and combine all existing evidence to provide the best possible answer to a research question. Well-performed reviews involve multiple steps including: searching for existing studies; assessment of the studies against predefined inclusion criteria; collection of data from each study; and assessment of each study's method and results. 

&nbsp;

This chapter presents the methods used to perform a systematic review of primary studies that have examined the relationship between the levels of blood lipids (such as cholesterol and triglycerides) and dementia outcomes. In addition, the review included studies that examined the relationship between treatments that change blood lipid levels, such as statins, and dementia outcomes. The results of this systematic review are presented in Chapter \@ref(sys-rev-results-heading).
:::

&nbsp;<!----------------------------------------------------------------------->  

## Introduction {#sys-rev-intro}

In this chapter, I describe a comprehensive systematic review of the relationship between blood lipid levels (and treatments that modify them) and the subsequent risk of dementia and related outcomes. 

This analysis sought to address two specific aims. Firstly, as discussed in the introduction to this thesis (Section \@ref(evidence-association)), several diverse forms of evidence on the relationship between lipids and dementia exist. These include randomised controlled trials, observational studies of different design, and Mendelian randomisation studies. However, based on a scoping review of existing literature, no previous evidence synthesis exercise has attempted to examine the association of lipid levels with dementia outcomes across these distinct evidence types. Collating these diverse evidence sources is important because, if the observed association between lipids and dementia is constant across them, it increases our confidence in the association. As such, the primary aim of this analysis was to systematically review all available literature describing prospective analyses, regardless of study design.

Secondly, I explicitly sought to include health-related preprint servers as a potential evidence source in this review, as they are infrequently considered by evidence synthesists but can report relevant unpublished analyses. As a sensitivity analysis to the review presented in this chapter, I sought to quantify the additional evidential value of relevant preprints, making use of the preprint search tool presented in Chapter \@ref(sys-rev-tools-heading).

Given the size of the review, I have split its reporting across two separate chapters for ease of reading. This chapter details the methodology used, while Chapter \@ref(sys-rev-results-heading) presents the results.

&nbsp;
<!----------------------------------------------------------------------------->

## Methods

### Protocol

A pre-specified protocol for this analysis was registered on the Open Science Framework platform and is available for inspection.[@mcguinnessluke2020] Deviations from the protocol are summarised in Appendix \@ref(appendix-sys-rev-deviations).

&nbsp;<!----------------------------------------------------------------------->  

### Contributions

In line with best-practice guidance, secondary reviewers were used to check the accuracy of screening, data extraction and risk-of-bias assessment processes. Due to the scale of the project, this review was performed in conjunction with a team of secondary reviewers (see Acknowledgements in the front matter).

<!----------------------------------------------------------------------------->
&nbsp;

### Eligibility criteria

#### Inclusion criteria 

I sought to include studies that examined blood lipid levels as a risk factor for dementia outcomes, defined either as a binary hypercholesterolemia variable or by category/1-standard-deviation increase in a specific lipid fraction. The fractions considered by this review were namely total cholesterol (TC), high-density lipoprotein cholesterol [(HDL-c)]{.correction}, low-density lipoprotein cholesterol [(LDL-c)]{.correction} and triglycerides (TG). I also aimed to include studies examining the effect of lipid-regulating agents (LRA) as a source of indirect evidence. LRA are treatments used to modify blood lipids levels, and common examples include statins, fibrates, and ezetimibe. 

Eligible study designs were randomised controlled trials of LRA, cohort studies examining blood lipids or LRA, and genetic instrumental variable studies (more commonly known as Mendelian randomisation studies) examining the effect of genetically increased/decreased blood lipid levels. 

Eligible studies screened participants for dementia at baseline and excluded any prevalent cases. Alternatively, where no baseline screening was employed, participants were assumed to be dementia free if less than 50 years of age at baseline. Eligible studies defined dementia outcomes according to recognised criteria. Examples of acceptable criteria included the International Classification of Diseases (ICD),[@organizationwho1993] the National Institute of Neurological Disorders and Stroke Association-Internationale pour la Recherche en l'Enseignement en Neurosciences (NINDS-AIREN),[@roman1993] and the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria.[@edition2013] Studies utilising electronic health records were an exception to this, as it was assumed that health care professionals employed their own expertise when entering the outcome into the EHR.

[Longitudinal]{.correction} studies [(randomised trials and observational studies)]{.correction} of any duration were included, and no limits were placed on the sample size of included studies. Finally, conference abstracts with no corresponding full-text publication were eligible, and where required, I contacted authors to obtain relevant additional information not available from the abstract. No limitations were imposed on publication status, date, venue or language.

<!-- TODO Any study using EHR - go back and extract codelists -->

<!----------------------------------------------------------------------------->
&nbsp;

#### Exclusion criteria

Due to the significant impact of a memory-related outcome such as dementia on exposure recall, case-control studies were excluded, though case-control studies where historical records are used to determine the exposure status were eligible for inclusion. Cross-sectional studies, qualitative studies, case reports/series and narrative reviews were also excluded, as were studies that measured change in continuous cognitive measures (e.g., Montreal Cognitive Assessment (MoCA) score) without an attempt to map these scores to ordinal groups (e.g., no dementia/dementia).[@tennant2021] Previous systematic reviews were not eligible for inclusion, but their reference lists were screened to identify any potentially relevant articles. 

Studies with outcomes not directly related to the clinical syndrome of dementia (e.g.,, neuroimaging) were excluded. Additionally, studies implementing a "multi-domain intervention" where a lipid-regulating agent is included in each arm were excluded. For example, a study examining exercise + statins versus statins alone would be excluded, but a study examining exercise + statins vs exercise alone would be eligible as the effect of statin use can be estimated. Finally, studies using a dietary intervention, for example an omega-3 fatty acid enriched diet, were excluded as it is difficult to disentangle the effect of other elements contained within the diet. Note that this is distinct from studies which delivered a simple tablet-based omega-3 intervention, which would have been eligible for inclusion.

<!----------------------------------------------------------------------------->
&nbsp;

### Information sources and search strategy

I systematically searched several electronic bibliographic databases to identify potentially relevant entries (hereafter referred to as "records"). The following databases were searched in June 2019 from inception onwards: MEDLINE, Embase, PsychINFO, Cochrane Central Register of Controlled Trials (CENTRAL), and Web of Science Core Collection. Given that the contents of the Web of Science Core Collection can vary by institution,[@gusenbauer2020] the specific databases and date ranges for each database searched via this platform are listed in Appendix \@ref(appendix-wos-databases).

The search strategy used in each database was developed in an iterative manner using a combination of free text and controlled vocabulary (MeSH/EMTREE)[@lefebvre2019searching] terms, incorporating input from an information specialist. The strategy included terms related to lipids, lipid modifying treatments, and dementia, and was designed for use in MEDLINE before being adapted to the other bibliography databases listed. A high-level outline of the strategy is presented in Table \@ref(tab:searchOverview-table) below and the full search strategies for each database are presented in Appendix \@ref(appendix-search-strategy). <!-- TODO Need to actually attach each search strategy. Should be able to loop through search strategy results. -->

&nbsp;

<!----------------------------------------------------------------------------->
(ref:searchOverview-caption) __Summary of systematic search by topic__ - The full search strategy for each databases searched, including all search terms and the number of hits per term, is presented in Appendix \@ref(appendix-search-strategy).

(ref:searchOverview-scaption) Summary of systematic search by topic

```{r searchOverview-table, message=FALSE, results="asis", echo = FALSE}
```
<!----------------------------------------------------------------------------->

&nbsp;

When searching the bibliographic databases, study design filters (Lines 9-11 in Table \@ref(tab:searchOverview-table)) were employed to try to reduce the screening load. I confirmed that the study design filters were not excluding potentially relevant records by screening a random sample of 500 records identified by the main search but excluded by the filters (defined as "8 NOT 12" in Table \@ref(tab:searchOverview-table)).

I also searched clinical trial registries, for example ClinicalTrials.gov, to identify relevant randomised controlled trials. In addition, I searched the bioRxiv and medRxiv preprint repositories using the tool developed in Chapter \@ref(sys-rev-tools-heading) to identify potentially relevant preprinted studies (see Appendix \@ref(appendix-medrxivr-code) for the code used to search these preprint repositories).

Grey literature was searched via ProQuest, OpenGrey and Web of Science Conference Proceedings Citation Index, and theses were accessed using the Open Access Theses and Dissertations portal. In addition, the abstract lists of relevant conferences (e.g., the proceedings of the Alzheimer's Association International Conference, published in the journal Alzheimer's & Dementia) were searched by hand. Finally, the reference lists of included studies were searched by hand, while forward and reverse citation searching ("snowballing") was performed using Google Scholar.[@greenhalgh2005; @wohlin2014]

&nbsp;
<!----------------------------------------------------------------------------->

### Study selection

Records were imported into Endnote and de-duplicated using the method outlined in Bramer et al. (2016).[@bramer2016] In summary, this method uses multiple stages to identify potential duplicates. The initial step involves automatic deletion of records matching on multiple fields ("Author" + "Year" + "Title" + "Journal"), and is followed by manual review of articles with less overlap (e.g., those identified as duplicates based on the "Title" field alone).

Following de-duplication of records, screening (both title/abstract and full-text) was performed using a combination of Endnote and Rayyan. Endnote is a citation management tool,[@hupe2019] and Rayyan is a web-based screening application.[@ouzzani2016] Title and abstract screening to remove obviously irrelevant records was performed, with a random ~10% sample of excluded records being screened in duplicate to ensure consistency with the inclusion criteria. Additionally, I re-screened the same ~10% sample with three-month lag to assess intra-rater consistency.

Similarly, I completed all full-text screening, with a random ~20% being screened by a second reviewer. Additionally, any records I identified as being difficult to assess against the inclusion criteria were also screened by the second reviewer. Reasons for exclusion at this stage were recorded. Disagreements occurring during either stage of the screening process were resolved through discussion with a senior colleague. A PRIMSA flow diagram was produced to document how records moved through the review.[@page2021]

<!----------------------------------------------------------------------------->
&nbsp;

### Validation of screening process

Inter- and intra-rater reliability during the screening stages were assessed for a 10% sub-sample of records. Intra-rater reliability involved a single reviewer applying the inclusion criteria to the same set of records while blinded to their previous decisions (i.e., assessment of consistency), while inter-rater reliability involved two reviewers independently screening the same set of records (i.e., assessment of accuracy).

Rater reliability was assessed using Gwet's agreement coefficient ($AC1$).[@gwet2008] This approach was chosen over other measures such as percent agreement because it accounts for chance agreement between reviewers but does not suffer from bias due to severely imbalanced marginal totals in the same way that Cohen's $kappa$ value does. [@cohen1960: @gwet2008; @wongpakaran2013] Given the small number of included studies in this review as a proportion of the total number screened, this is a useful characteristic. The formulae used to calculate $AC1$ are given in Appendix \@ref(appendix-gwet). 

Interpretation of agreement coefficients is a widely debated topic, and while arbitrary cut-off values may mislead readers,[@brennan1992] they provide a useful rubric by which to assess inter-rater agreement. Here, I used guidelines based on a stricter interpretation of the Cohen's $kappa$ coefficient,[@mchugh2012] presented in Table \@ref(tab:gwet-table).

&nbsp;

<!----------------------------------------------------------------------------->
(ref:gwet-caption) __Ranges for Gwet's $AC1$__ - To aid the interpretation of inter-/intra-rater reliability, I used a published set of suggested categories for Gwet's AC1.[@mchugh2012]

(ref:gwet-scaption) Ranges for Gwet's $AC1$

```{r gwet-table, message=FALSE, results="asis", echo = FALSE}
```
<!----------------------------------------------------------------------------->

&nbsp;

Intra- and inter-rater reliability was assessed against these cut-offs. If this assessment demonstrated an issue with the screening process (defined as an $AC1$ of less than 0.9), it indicated the need for a larger proportion of records to be dual screened.

&nbsp;
<!----------------------------------------------------------------------------->

### Data extraction

Data extraction was performed using a piloted data extraction form. I extracted all data in the first instance, which was subsequently checked for accuracy by a second member of the review team. Extracted items included:

* Article metadata: year of publication, author, journal
* Study characteristics: study location, data source, exposure, outcomes, diagnostic criteria used
* Patient characteristics: age, sex, baseline cognition scores, baseline education scores
* Results: exposure, outcome, effect measure, effect estimate, error estimate, p-value 

&nbsp;<!----------------------------------------------------------------------->  

#### Grouping multiple reports into studies

<!-- COMMENT [Yoav] How did you choose what data to include? re: studyification -->

<!-- COMMENT [Julian] Same analysis or same data? Don't want 2+ analyses of the same data to appear in a meta-analysis (unless orthogonal; e.g., MR + RCT) -->
<!-- QUESTION Why does being orthogonal make it okay? And the Three City Study is likely to be included several times -->

As part of the data extraction process, multiple records resulting from a single analysis were included and grouped into single units, hereafter referred to as studies. This was most common in cases where multiple records reported on the same analysis but at different time points. In this case, the result corresponding to the longest follow-up was used. Grouping records into studies builds out the most comprehensive account of a given analysis by incorporating information from all available records. 

This was particularly relevant to preprints and published papers reporting the same study, which were not considered to be duplicate records but instead different reports of the same study. This decision was taken due to the potential for the published version to offer some information that the preprint did not, and vice versa.

<!----------------------------------------------------------------------------->
&nbsp;

#### Combining across groups

In line with best practice, where summary data was presented across two groups (e.g., age at baseline stratified by hypercholesterolemia status), the following approach was used to combine the groups:[@higgins2019]

\begin{equation}
N = N_1 + N_2
  (\#eq:combiningGroups1)
\end{equation}


\begin{equation}
Mean = \frac{(N_1M_1 + N_2M_2)}{(N_1 + N_2)}
  (\#eq:combiningGroups2)
\end{equation}


\begin{equation}
SD = \sqrt{\frac{(N_1-1)SD_1^2 + (N_2-1)SD_2^2 + \frac{N_1N_2}{N_1 + N_2}(M_1^2 + M_2^2 - 2M_1M_2)}{N1 + N2 -1}}
  (\#eq:combiningGroups3)
\end{equation}
&nbsp;

where $N_i$, $M_i$ and $SD_i$ denote the number of participants, mean and standard deviation in the $i$th subgroup, respectively. This approach was implemented in a systematic manner, with the raw group data being extracted and a cleaning script employed to combine the groups for analysis.

<!----------------------------------------------------------------------------->
&nbsp;


#### Harmonisation of cholesterol measures

Where necessary, lipid levels reported in _mmol/L_ were converted in _mg/dL_ using the following formula:

\begin{equation} 
  mg/dL = mmol/L \times{} Z
  (\#eq:lipidConversion)
\end{equation} 

where $Z = 38.67$ when converting total cholesterol, LDL-c and HDL-c measures, and $Z = 88.57$ when converting triglyceride measures.[@rugge2011] The choice of _mg/dL_ was influenced by the widely-used categories of lipids levels on the _mg/dL_ scale, as shown in Table \@ref(tab:lipidLevels-table) in Section \@ref(intro-lipid-fractions).

<!----------------------------------------------------------------------------->
&nbsp;


#### Following up with authors {#contacting-authors}

Where additional data not reported by a study were required either for data extraction or risk-of-bias assessment, the corresponding author of the study was contacted. This approach was taken due to the potentially large impact of additional information obtained through contact with study authors on the results of the review.[@reynders2019] 

&nbsp;<!----------------------------------------------------------------------->  

### Risk-of-bias assessment {#risk-of-bias}

A key aim of the review presented here is to identify different sources of evidence at risk of a diverse range of biases, and to contrast and compare findings across them (see Section \@ref(intro-triangulation) for an overview of triangulation and Chapter \@ref(tri-heading) for the results of this analysis). To enable this triangulation exercise, detailed and structured risk-of-bias assessments formed an important part of this review.

There has been a recent movement within the evidence synthesis community away from examining _methodological quality_ towards assessing _risk of bias_,[@mcguinness2018; @sterne2016] and thus directly evaluating the internal validity of a study. Internal validity is defined here as the absence of systematic error (or bias) in a study, which may influence its results.[@campbell1957; @juni2001] 

This move was prompted by an unclear definition of methodological quality which could include facets such as unclear reporting, in addition to challenges in the comparison of results from different tools. As part of this shift, the focus moved from checklists and score-based tools towards domain-based methods, in which different potential sources of bias in a study are assessed in order. Additionally, the new tools move from assessing bias at the study level to considering each individual numerical result separately.  For example, a study may report on the efficacy of an intervention at six months and two years follow-up. In this case, the proportion of missing outcome data may not be an issue at six months but may introduce bias after two years of follow-up. In this case, assigning a single risk-of-bias judgement to the study as a whole would mask the different biases applicable to each unique result.

In this review, domain-based tools were used to assess the risk of bias for each result in each included study. The study-design-specific tools are introduced and discussed in more detail in the following sections. 

<!----------------------------------------------------------------------------->
&nbsp;

#### Randomised controlled trials

Randomised controlled trials were assessed using the RoB 2 tool.[@sterne2019] This tool assesses the risk of bias across five domains:

* bias arising from the randomisation process, 
* bias due to deviations from intended intervention, 
* bias due to missing outcome data, 
* bias in measurement of the outcome, 
* bias in selection of the reported result. 

Acceptable judgements for each domain include: "low risk", "some concerns", "high risk". Each of the five domains contains a series of signaling questions or prompts which guide the user through the tool. Once a domain-level judgement for each domain has been assigned, an overall judgement, using the same three levels of risk of bias, is assigned to the result.

<!----------------------------------------------------------------------------->
&nbsp;

#### Non-randomised studies of interventions/exposures {#rob-tools-nrse}

For non-randomised studies of interventions (NRSI), I used the ROBINS-I (Risk Of Bias In Non-randomised Studies - of Interventions) tool.[@sterne2016] This tool assesses the risk of bias across seven domains: 

* bias due to confounding, 
* bias due to selection of participants, 
* bias in classification of interventions, 
* bias due to deviations from intended interventions, 
* bias due to missing data, 
* bias in measurement of outcomes, and 
* bias in selection of the reported result. 

Similar to RoB 2, it has a number of signaling questions which inform the domain-level judgement, with acceptable judgements including “low”, “moderate”, “serious” and “critical”. In the context of the tool, observational studies are assessed in reference to an idealised randomised controlled trial. Under this approach, the (rare) overall judgement of "low" indicates that the results should be considered equivalent to that produced by a randomised controlled trial.

While a risk-of-bias tool for non-randomised studies of exposures (NRSE) is currently under development,[@morganr2020] it was insufficiently developed at the time the risk-of-bias assessments were performed. Instead, I used a version of the ROBINS-I tool informed by the preliminary ROBINS-E tool (Risk of Bias In Non-randomised Studies – of Exposure), which I had previously applied in a published review.[@french2019] The version had no signaling questions and so judgements, using the same four levels of risk of bias as ROBINS-I, were made at the domain level. The motivation for using this tool above other established tools - such as the Newcastle-Ottowa scale (NOS)[@wells2000] - was two-fold. In the first instance, as mentioned in the introduction to this section, using a domain-based tool has distinct advantages over better-developed checklist-type tools including the NOS. Secondly, using a domain-based tool for non-randomised studies of exposures enabled better comparison with risk-of-bias assessments performed for the other study designs considered by this review.

<!----------------------------------------------------------------------------->
&nbsp;

#### Mendelian randomisation studies

At present, no formalised risk-of-bias assessment tool for Mendelian randomisation studies is available. Assessment of the risk of bias in Mendelian randomisation studies was informed by the tool developed by Mamluk _et al_ for use in their own review.[@mamluk2020] This tool was identified through a meta-review of risk-of-bias assessments in systematic reviews of Mendelian randomisation studies,[@spiga2021] advanced results of which were obtained through personal communication with the authors. A copy of this tool is available in Appendix \@ref(appendix-mr-rob), but in summary, results were assessed for bias arising from weak instruments, genetic confounding, other confounding, pleiotropy and population stratification. Acceptable judgements for each of the five domains in the tool were "low", "moderate" and "high" risk of bias.

<!----------------------------------------------------------------------------->
&nbsp;

#### Risk of bias due to missing evidence {#methods-rob-me}

In addition to assessing the risk of bias within each result contributing to a synthesis, I also assessed risk of bias due to missing evidence at the synthesis level. This assessment examines evidence missing due to selective non-reporting as opposed to the selective reporting of a single result from multiple planned analyses. The assessment was performed using the forthcoming RoB-ME (Risk of Bias due to Missing Evidence in a synthesis) tool.[@zotero-15123] The tool is in development stages, and as part of this review, I piloted the tool and provided feedback to the developers.

<!----------------------------------------------------------------------------->
&nbsp;

### Analysis methods

#### Analysis overview {#sys-rev-analysis-overview}

An initial qualitative synthesis of evidence was performed, summarising the data extracted from studies stratified by study design. Where individual studies were deemed comparable, they were incorporated into a quantitative analysis or "meta-analysis". Results were not combined across different study designs (i.e., RCTs were not combined in a meta-analysis with results from observational studies). The summary effect estimates produced across individual study designs are discussed but are compared and contrasted more fully as part of the triangulation exercise presented in Chapter \@ref(tri-heading). Similarly, analyses are presented separately for each dementia outcome of interest (all-cause dementia, Alzheimer's disease, vascular dementia). [A final division of analyses was made based on the reported exposure measurement (e.g. binary measurement of hypercholesterolemia, standard deviation change in lipid fraction, etc.).]{.correction}

The range of effect measures presented by studies (odds ratios, risk ratios, hazard ratios, etc.) are not directly interchangeable in the context of systematic review.  As such, different reported effect estimates can be one potential problem that precludes a meta-analysis of all studies.[@mckenzie2019] However if the outcome is rare [(<10%)]{.correction}, as is the case for dementia outcomes, the estimated odds, risk and hazard ratios will approximate each other.[@vanderweele2020] As such, I did not stratify the analyses by reported effect measure.
 
&nbsp;<!----------------------------------------------------------------------->  

#### Random-effects meta-analysis {#meta-analysis-methods}

<!-- TODO May need to talk about FE models here, as I refer to this section for a broader discussion of meta-analysis in the IPD chapter -->

I used a random-effects model in all meta-analyses.[@hedges1998;@dersimonian1986] Random-effects meta-analysis does not assume one true underlying effect, but rather allows for a distribution of true effects with variance $\tau^2$. The result in the $i$th study is denoted as $y_i$. The weight ($w_i$) assigned to this result is then given as the inverse of the variance of that result ($v_i$) plus the estimate of between-result variance ($\tau^2$):

\begin{equation}
w_i = \frac{1}{v_i+\tau^2}
  (\#eq:ivweighting)
\end{equation}

Once the weights are calculated for each result, the overall estimate ($\hat{y}$) and variance ($Var(\hat{y})$) can be estimated:

\begin{equation}
\hat{y} = \frac{\sum{y_iw_i}}{\sum{w_i}}
  (\#eq:rmaEstimate)
\end{equation}

\begin{equation}
Var(\hat{y}) = \frac{1}{\sum{w_i}}
  (\#eq:rmaVariance)
\end{equation}

Results included in each meta-analysis were stratified into subgroups on the basis of the overall risk-of-bias judgement. Summary estimates for each subgroup, in addition to an overall effect estimate, are displayed in each forest plot. Additional descriptive statistics are presented, while prediction intervals are shown as a dotted line banding the overall effect estimate. Finally, where at least 10 results are available, a test of subgroup differences between studies at different levels of risk of bias was performed (see subsequent Section \@ref(sys-rev-visualising-results)).[@deeks2019] All models were implemented using the `metafor` R package.[@R-metafor]

&nbsp;<!----------------------------------------------------------------------->  

#### Dose-response analyses

Several of the included studies presented data on multiple categories of lipid levels in addition to an overall effect estimate based on a comparison of only two of these categories (e.g., for example, highest vs lowest quartile). While this two-category comparison allows for easy interpretation of the resulting effect estimate, it ignores any potential non-linear relationships between the exposure and outcome, in addition to discarding useful information contained in the interim groups. In order to address this limitation, I performed a dose-response meta-analysis in those studies reporting more than two categories of lipid levels.

Studies were excluded from this analysis if the number of categories was less than three or if the necessary information for synthesis (cut-off points, number of participants and number of events per category) was not available. When the highest reported category was open ended (e.g., LDL-c $\geqslant$ 200 mg/dL), I calculated the category midpoint by assuming the width of the highest category was the same as the one immediately below it. Similarly, when the lowest category was open-ended (e.g LDL-c $\leqslant$ 100 mg/dL), I set the lower boundary for this category to zero (though this is unlikely to occur naturally).

I took a two-stage dose-response meta-analysis approach, where study-specific trends are estimated before being pooled across studies.[@greenland1992] To estimate within-study trends, a restricted cubic spline model was fitted within each study. This approach allows for a non-linear relationship between lipid levels and dementia, for example a U or J-shaped relationship, where low and high levels of a lipid fraction can have different effects versus the "normal" reference dose.[@durrleman1989;@liu2009] Reference doses were defined _a priori_ as the cut-off of the "Normal"/"Optimal" categories for each fraction, as detailed in Table \@ref(tab:lipidLevels-table). Under this approach, the reference dose was defined as 200 mg/dL for total cholesterol, 100 mg/dL for LDL-c, 40 mg/dL for HDL-c, and 150 mg/dL for triglycerides. 

Restricted cubic spline models use "knots" to separate the data into subsets or "windows", in which the trend is modelled. Further restrictions are then imposed, in that the curves in each window must join up "smoothly" at the knot location, and that the trend is assumed to be linear before the first knot and after the last knot.[@gauthier2020] The locations of the knots in the model were identified using fixed percentiles (5th, 50th, 95th) of the exposure data.[@durrleman1989] Once the within-study trends had been estimated, they were combined in a multivariate random-effects meta-analysis using the restricted maximum likelihood (REML) method.[@liu2009; @white2009;@gasparrini2012] All dose-response analyses were implemented using the `dosresmeta` R package.[@crippa2016]

&nbsp;
<!----------------------------------------------------------------------------->

#### Additional analyses

Where there was evidence of heterogeneity between results included in a meta-analysis, I investigated this further using meta-regression against reported characteristics. _A priori_, I was interested in the effect that the age at baseline, sex and risk-of-bias judgement had on the results. Syntheses with greater than 10 results were assessed for heterogeneity across these covariates.[@deeks2019] I also investigated the potential for small study effects (of which publication bias is one potential cause) in syntheses with greater than 10 results, both visually using funnel plots and formally using Egger's regression test.[@sterne2011]

&nbsp;
<!----------------------------------------------------------------------------->

#### Visualisation of results {#sys-rev-visualising-results}

Evidence maps are a useful way to explore the distribution of research cohorts included in a systematic review.[@saran2018] As part of the initial descriptive synthesis, the location of each individual study contributing to the evidence base was visualised on a world map.

One of the limitations of current risk-of-bias assessment practice in systematic reviews is that they are often divorced from the results to which they refer, and are infrequently incorporated into the analysis.[@marusic2020; @katikireddi2015] In response to this criticism, I developed a new visualisation tool which was designed to allow for the production of "paired" forest plots - [where]{.correction} a risk-of-bias assessment is presented alongside it's corresponding numerical result -  as recommended by the RoB 2 publication.[@sterne2019] In addition, the risk of bias due to missing evidence in each synthesis, as assessed using the RoB-ME tool, is shown beside the overall summary diamond. This tool was developed as an adjunct to this thesis,[@mcguinness2020robvisPaper] and summary of its functionality can be found in Appendix \@ref(appendix-robvis). Unless otherwise stated in the figure, only a single effect measure (hazard ratio/odds ratio/etc.) is represented in a given forest plot.

&nbsp;
<!----------------------------------------------------------------------------->

#### Assessment of added value of including preprints

Preprints are considered a valuable evidence source within this thesis (see Section \@ref(diverse-sources-preprints)). As an adjunct analysis to this review, meta-analyses including a preprint result were re-analysed using a fixed effect model. The weight assigned to the preprinted result was extracted and used to assess the additional evidential value provided to the synthesis by the preprint. In addition, I assessed the impact of including the preprinted results on the summary estimate, using a leave-one-out approach.

Finally, I performed a follow-up analysis, allowing for a two-year lag, to investigate whether included preprints had been subsequently published (in which case preprints provided a snapshot into the future, and a systematic review update would capture the published version of the preprint) or not (in which case preprints provided a distinct evidence source to conventional bibliographic databases).

&nbsp;
<!----------------------------------------------------------------------------->

## Summary

* In this chapter, I have presented the methods underpinning a comprehensive systematic review of the existing literature on the association of dementia incidence with blood lipid levels and lipid-regulating agents.

* I have detailed how this review found, extracted, critically assessed and synthesized the results of existing studies. Additionally, I have highlighted how this review examined the impact of including preprinted results, as enabled by the research tool presented in the previous chapter.

* In the following chapter, I present the results of this review, and detail how the evidence identified is used throughout the remainder of the thesis.