Radiation Therapy in the German Hodgkin Study Group HD 16 and HD 17 Trials: Quality Assurance and Dosimetric Analysis for Hodgkin Lymphoma in the Modern Era

Purpose Radiation therapy (RT) is an integral part of treatment concepts for early-stage Hodgkin lymphoma. This analysis reports on RT quality in the recent HD16 and 17 trials of the German Hodgkin Study Group (GHSG). Methods and Materials All RT plans of involved-node radiation therapy (INRT) in HD 17 were requested for analysis, along with 100 and 50 involved-field radiation therapy (IFRT) plans in HD 16 and 17, respectively. A structured assessment regarding field design and protocol adherence was performed by the reference radiation oncology panel of the GHSG. Results Overall, 100 (HD 16) and 176 (HD 17) patients were eligible for analysis. In HD 16, 84% of RT series were evaluated as correct, with significant improvement compared with the predecessor studies (P < .001). In HD 17, 76.1% of INRT cases revealed a correct RT design compared with 69.0% of IFRT-cases, which was superior to previous studies (P < .001). Comparing INRT and IFRT, we found no significant differences in the percentage of any deviation (P = .418) or major deviations (P = .466). Regarding dosimetry, INRT was accompanied by an improvement in thyroid doses. Comparing different RT techniques, we found that intensity-modulated RT showed a reduction of high doses in the lung at the expense of an increased low-dose exposure in HD 17. Conclusions The latest study generation of the GHSG demonstrates an improved quality in RT. A modern INRT design could be established without deterioration in quality. On a conceptual level, an individual consideration of the appropriate RT technique has to be performed.


Introduction
Radiation therapy (RT) has been an integral part of different treatment schedules for Hodgkin lymphoma (HL) since the introduction of large-field techniques in the second half of the 20th century. 1 Since then, the technical evolution of RT, the implementation of systemic chemo- and immunotherapy, and a deeper understanding of the disease's biology have led to a continuous decrease in both RT field size and dose. 2,3 Today, a combined modality approach is applied using multiagent chemotherapy regimens like ABVD (doxorubicin, bleomycin, vinblastine, dacarbazine) or BEACOPP (bleomycin, etoposide, doxorubicin, cyclophosphamide, vincristine, procarbazine, and prednisone), followed by consolidative RT. 4,5 Involved-node radiation therapy (INRT) or involved-site radiation therapy (ISRT) is considered state of the art for first-line treatment with a dose of 20 to 30 Gy adapted to disease stage and risk category. [4][5][6] Recent randomized trials questioned the role of RT by introducing chemotherapy-only treatment concepts to early-favorable and early-unfavorable/intermediate-stage HL, respectively, but failed to demonstrate therapeutic equivalency of the monomodal treatment. 7,8 The HD 16 and HD 17 trials represent the latest generation of phase 3 trials launched by the German Hodgkin Study Group (GHSG) addressing early- or intermediate-stage HL, respectively. 9,10 In the HD 16 trial, patients with HL in an early-favorable stage were randomized after 2 cycles of ABVD chemotherapy between a standard arm of 20 Gy of involved-field radiation therapy (IFRT) and a positron emission tomography (PET)-adapted arm, in which RT was limited to PET-positive cases. 9 The HD 17 trial randomized patients with HL in intermediate/early-unfavorable stage after upfront chemotherapy of 2 cycles of escalated BEACOPP and 2 cycles of ABVD between a standard treatment (30 Gy IFRT) and an experimental approach. 10 The latter stratified patients according to PET status, with an additional 30 Gy INRT in cases of PET positivity and no further treatment for complete metabolic remissions. These modern treatment strategies account for the biology of HL on an individual level and direct RT to high-risk patients. However, the quality of RT planning and treatment setup in these trials has not been studied in detail. Previous analyses on quality assurance have illustrated the importance of a systematic assessment of RT plans leading to harmonization and improvement of treatment concepts. [11][12][13][14] The aim of the present analysis is to continue this work and to examine whether the high standards of the GHSG could be maintained. One essential question was whether the transfer from IFRT to INRT within HD 17 was successful. In addition, information on organs-at-risk delineation and dosimetric data are provided.

Methods and Materials
Study concepts

HD 16
The phase 3 international HD 16 study (NCT00736320; EudraCT code: 2007-004474-24) included more than 1100 patients with newly diagnosed HL aged between 18 and 75 years in Ann Arbor stage I or II without any risk factor (mediastinal bulk, extranodal involvement, elevated erythrocyte sedimentation rate, 3 or more involved nodal areas). Treatment consisted of 2 cycles of ABVD followed by a fluorodeoxyglucose-PET scan. In the standard arm, all patients received 20 Gy IFRT, whereas patients in the experimental arm had RT only in case of a PET-positive result (Deauville score ≥3). Results of the main trial showed a deterioration of progression-free survival (PFS; 86.1% vs 93.4% at 5 years) with the omission of RT. 9

HD 17
The phase 3 HD 17 trial (NCT01356680; EudraCT code: 2007-005920-34) enrolled 1100 patients with early-stage unfavorable HL aged between 18 and 60 years in Ann Arbor stage I-IIA with any combination of the aforementioned risk factors. In addition, patients in Ann Arbor stage IIB with an elevated erythrocyte sedimentation rate and/or at least 3 nodal areas were included. Patients underwent 2 cycles of escalated BEACOPP and 2 cycles of ABVD and subsequent 30 Gy IFRT in the standard arm. In the experimental arm, RT was administered to patients with PET-positive results after chemotherapy as 30 Gy INRT. This PET-stratified approach proved to be noninferior regarding PFS (5-

RT quality analysis
RT quality analysis was an integral part of the study protocol and covered by the respective institutional review board consents. The study goal was to analyze all patients with INRT treatment in HD 17. Because IFRT is an established concept, only a randomized selection of patients undergoing this treatment was included in the analysis. In HD 16, 100 patients were analyzed.

Panel evaluation
All available RT series were assessed by the radiation oncology reference panel of the GHSG, the members of which are authors of this paper. To account for different opinions, at least 3 heads of department had to be present at each meeting. The evaluation process included the initial (pre-chemotherapy) imaging, the RT plans, and the recommendation from one of the reference radiation oncology institutions. During the structured process, target volume definition, RT techniques and setup, fraction and total doses, adherence to the reference RT recommendation, and correct execution of the plan were taken into account. Evaluation results were graded as "no deviation," "minor deviation," or "major deviation" according to the study protocol. A "major deviation" describes relevant and severe violations of the study protocol that may endanger patient safety or treatment efficacy, for instance, field delineations that are too narrow in an involved region or dose deviations greater than 10% (Fig. 1). In comparison, "minor deviation" is used for any (negligible) difference from the study protocol not fulfilling the criteria of a major event. Evaluations were discussed in a multilevel process until a consensus was reached. Initially, panel rounds were held as in-person meetings in Muenster, Germany. With the advent of the COVID-19 pandemic, the rounds were moved to digital conferences.

Dosimetric analysis
Dose-volume histograms of the RT plans were analyzed for dosimetric information either as a paper-based evaluation or digitally. Subsequently, organs at risk (OAR) as well as dosimetric parameters (Dmean or Dmax, respectively) were registered and compared between patients receiving IFRT and INRT and between different treatment techniques. For supradiaphragmatic RT fields, dose values for the spinal cord, left and right lung, esophagus, heart, parotid glands, thyroid gland, female breast, and coronary vessels were evaluated, whereas the analysis for infradiaphragmatic RT fields encompassed the spinal cord, kidneys, bowel, and gonads.
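The dosimetric parameters named above can be illustrated with a minimal Python sketch. It assumes a hypothetical list of per-voxel doses (in Gy) for a single organ with equal voxel volumes; the function name `dvh_metrics`, the thresholds, and the sample values are illustrative assumptions and are not taken from the trial data or from the software used by the panel.

```python
from statistics import mean


def dvh_metrics(voxel_doses_gy, thresholds_gy=(5, 10, 20, 25, 30)):
    """Summarize one organ's dose distribution.

    Assumes every voxel has the same volume, so volume fractions
    reduce to voxel counts (an illustrative simplification).
    """
    if not voxel_doses_gy:
        raise ValueError("at least one voxel dose is required")
    n = len(voxel_doses_gy)
    metrics = {
        "Dmean": mean(voxel_doses_gy),  # mean organ dose in Gy
        "Dmax": max(voxel_doses_gy),    # maximum point dose in Gy
    }
    for t in thresholds_gy:
        # Vx: percentage of the organ volume receiving at least x Gy
        covered = sum(1 for d in voxel_doses_gy if d >= t)
        metrics[f"V{t}"] = 100.0 * covered / n
    return metrics


# Hypothetical lung voxel doses in Gy (not patient data)
example_lung = [0.5, 2.0, 6.0, 12.0, 21.0, 26.0, 8.0, 1.0]
print(dvh_metrics(example_lung))
```

In this notation, V20 is the percentage of the organ volume that receives 20 Gy or more, which is how the volume parameters reported for the lung are commonly read.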

Statistical analysis
Continuous variables are summarized by the minimum, median, and maximum values, whereas categorical variables are presented as absolute numbers or relative frequencies. Normal distribution was assessed using a Shapiro-Wilk test. Depending on whether the data were normally distributed, a 2-sample t-test or a Mann-Whitney U test was used to compare mean or median values or ranks, respectively. For these analyses, exact significances were considered. Distributions were compared using a Kolmogorov-Smirnov test. A χ² test was used for comparisons between different evaluation results, testing for 2-sided exact significance. In all cases, a P value < .05 was considered to be statistically significant. All statistical analyses were carried out using Microsoft Excel (Microsoft, Redmond, WA) and SPSS version 28 (IBM, Armonk, NY).

Results
Panel evaluation

HD 16
Overall, 100 patients undergoing IFRT were analyzed. Radiation doses were adequate with a median of 20 Gy (19.8-21.6 Gy) in normofractionation. In the majority of radiation series, supradiaphragmatic target volumes were treated (91%), with the regions most commonly irradiated being supraclavicular left/right (73% each), infraclavicular left/right (73% each), and cervical left/right (44% and 45%, respectively). According to the RT panel, 84% of cases were evaluated as "correct," 5% as "minor deviations," and 11% as "major deviations." Major deviations were caused by insufficient dose coverage of involved regions (11/11), predominantly in the upper mediastinum (5/11 cases; Table 1). Previous GHSG studies in early-stage HL (HD 10 [12] and HD 13 [11]) showed a lower number of RT series executed according to protocol (38.8% and 52%, respectively). A χ² test was used to compare the number of correct RT series in the different GHSG study generations (HD 13 vs HD 16). No expected cell frequencies were less than 5. Results reveal a significant improvement in favor of HD 16 (χ²[1] = 33.807, P < .001, φ = −0.247). A similar improvement was found concerning the absence of major deviations in relation to the GHSG study generation (χ²[1] = 27.378, P < .001, φ = −0.222).

HD 17
In total, 176 patients (INRT: 134, IFRT: 42) were analyzed and treated with a median RT dose of 30 Gy (IFRT: 18-30.6 Gy, INRT: 14-40 Gy). Overall, 76.1% of INRT cases showed no deviation compared with 69.0% of IFRT cases. Deviations were reported for 9.7% and 14.2% of patients in the INRT group compared with 11.9% and 19.0% in the IFRT group for minor and major deviations, respectively. There was no significant difference between both cohorts regarding the percentage of plans with deviations (P = .418) or the percentage of major deviations (P = .466). The principal causes for major deviations were too narrow target volumes in the involved region (IFRT: 6 vs INRT: 17) or incorrect RT doses (IFRT: 1, INRT: 2; Table 1). In comparison with the HD 11 and HD 14 trials, there was a continuous increase in the percentage of RT series performed according to protocol (33.0% in HD 11 vs 37.8% in HD 14 vs 74.4% in HD 17 [pooled data for INRT and IFRT]). Again, a χ² test was applied for comparisons between the different GHSG study generations (HD 14 vs HD 17). No expected cell frequencies were less than 5. The results underlined a significant improvement in favor of the recent study generation (χ²[1] = 71.045, P < .001, φ = −0.310). Similar outcomes were found for the association of GHSG study generation and the absence of major deviations (χ²[1] = 75.232, P < .001, φ = −0.319).
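As an illustration of the χ² statistic and the φ coefficient reported for these 2 × 2 comparisons, the following Python sketch computes both from a contingency table. It is a minimal sketch, not the SPSS procedure used in the study: it uses the asymptotic Pearson test without continuity correction, whereas the reported P values are exact significances, so results need not match. The example counts (102/32 and 29/13) are an assumption, back-calculated from the percentages of INRT and IFRT plans without deviation, and the function name `chi2_phi_2x2` is hypothetical.

```python
import math


def chi2_phi_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction), its
    asymptotic P value, and phi for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    det = a * d - b * c
    margin_product = row1 * row2 * col1 * col2
    chi2 = n * det ** 2 / margin_product
    # phi keeps the sign of the association; its sign depends on how
    # rows and columns are ordered.
    phi = det / math.sqrt(margin_product)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, phi


# Assumed counts (back-calculated, not taken from the trial database):
# rows = INRT, IFRT; columns = no deviation, any deviation
chi2, p, phi = chi2_phi_2x2(102, 32, 29, 13)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}, phi = {phi:.3f}")
```

For a 2 × 2 table, φ² equals χ² divided by the total number of cases, which is why φ can be read as an effect size that is independent of sample size.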

Comparison between INRT and IFRT
The study design of HD 17 enabled a dosimetric comparison between INRT and IFRT (Table 2). For most
Abbreviations: IFRT = involved-field radiation therapy; INRT = involved-node radiation therapy; PTV = planning target volume; RT = radiation therapy. RT plans were evaluated as "correct/no deviation," "minor deviation," or "major deviation," respectively (see text for further details). Percentages of the respective categories with absolute numbers given in parentheses. *In the INRT arm of HD 17, there was one case with excessive beam energy and a too large PTV; thus, subcategories do not add up to 9.7%.

RT technique
In HD 16, RT was executed as 3-dimensional conformal RT (3D-CRT) in 76 patients, as intensity-modulated RT (IMRT; including volumetric modulated arc therapy and tomotherapy) in 18 patients, and as a combination of modalities in 2 patients; the technique was unknown in 4 patients. In HD 17, 28 and 67 patients received 3D-CRT compared with 13 and 66 patients who received IMRT in the IFRT and INRT groups, respectively. One case in the IFRT arm had an unknown technique, and 1 case in the INRT arm underwent a combined treatment.
Comparing conventional RT techniques with modern IMRT revealed different dose exposures. For patients in HD 16, lung volumes exposed to 20 Gy were reduced with IMRT (median right: 1.6 vs 0.0 Gy, P = .014; median left: 3.3 vs 0.0 Gy, P = .006; Table 3). For the spinal cord, the distribution of values differed between the 2 groups (P < .05 in the Kolmogorov-Smirnov test), and a significant difference was found (mean rank 3D-CRT: 39.33 vs IMRT: 15.7, P < .01, U = 102.0, Z = −3.356). In addition, technical comparisons between 3D-CRT and IMRT in HD 17 showed a significant increase in V5 and V10 but a concomitant decrease in V20, V25, and V30 with the use of IMRT (Table 4).
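The mean ranks, U, and Z values reported for the spinal cord comparison are the standard outputs of a Mann-Whitney U test. The Python sketch below shows how these quantities relate to each other. It is an illustrative sketch under stated assumptions: the function names are hypothetical, the sample doses are invented, and Z uses the plain normal approximation without tie or continuity correction, so it does not reproduce the exact significances computed in SPSS.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Mean ranks, U (smaller of U1 and U2), and approximate Z."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[:n1])  # rank sum of the first group
    r2 = sum(ranks[n1:])  # rank sum of the second group
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    # Normal approximation without tie or continuity correction
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return {"mean_rank_x": r1 / n1, "mean_rank_y": r2 / n2, "U": u, "Z": z}


# Invented spinal cord maximum doses in Gy (not trial data)
dose_3dcrt = [19.5, 20.1, 18.7, 20.4, 19.9]
dose_imrt = [12.3, 15.0, 17.8, 14.1]
print(mann_whitney(dose_3dcrt, dose_imrt))
```

A higher mean rank in one group indicates that its values tend to be larger, which is how the difference between the 3D-CRT and IMRT groups can be read.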

Organs at risk delineation
Information on supradiaphragmatic OAR could be retrieved for 79 patients in HD 16 and 146 patients in HD 17 and revealed heterogeneity in contouring (Table 5). Rates of contouring greater than 90% were found for the spinal cord, whereas numbers dropped considerably for breast tissue (3.8%-21.6% for left and right breast in HD 16 and the 2 arms of HD 17, respectively). There was no case in which the coronary vessels were contoured.
With the majority of patients undergoing supradiaphragmatic treatments, only sparse information on infradiaphragmatic dose exposure was available. Overall, there were 11 patients in HD 17 (3 IFRT and 8 INRT) and 9 patients in HD 16 with involvement below the diaphragm. Most commonly, the spinal cord and both kidneys were contoured (100% in HD 17, but only 11.1% or 22.2% in HD 16), whereas the bowel or gonads were contoured in only a minority of cases (HD 16: 0%, HD 17: 20%).
Abbreviations: IFRT = involved-field radiation therapy; INRT = involved-node radiation therapy. Percentages and absolute numbers (in parentheses) for the contouring and dose availability of the respective organ.
Advances in Radiation Oncology: May–June 2023. Quality assurance of RT in HD 16/17

Toxicity
In both studies, acute toxicities during RT were mild to moderate, with only 3 cases of grade 3 toxicities in HD 16 (1 nausea or vomiting, 1 dysphagia, 1 mucositis). In HD 17, 3 cases of dysphagia occurred in each of the INRT and IFRT arms. One of these patients in the IFRT cohort had additional grade 3 mucositis, and one additional patient in the INRT arm suffered from grade 3 to 4 leukopenia.

Discussion
The present analysis demonstrates a high quality of RT for HL in the modern era. It is the first to document and analyze the paradigm shift from IFRT to INRT, revealing constancy in quality. Furthermore, it illustrates the challenges of complying with restrictive dose constraints in the modern era.
In the past decades, the use of RT in stage I and II has declined significantly according to an analysis from the Surveillance, Epidemiology, and End Results database (62.9% in 1988-1991 vs 43.7% in 2004-2006; P < .001). 15 The British RAPID trial, the European H10 trial, and the GHSG HD 16 trial attempted to omit RT from the treatment schedule but could not demonstrate noninferiority of the chemotherapy-only regimens. [7][8][9] The experimental arms of these trials were driven by the idea of maintaining therapeutic equivalency while reducing long-term toxicity, as secondary malignancies and cardiovascular diseases are the predominant mortality risks in the second decade after lymphoma treatment. 16,17 Although both infra- and supradiaphragmatic RT are described as risk factors for secondary cancers, no decrease in their incidence could be observed in the modern era. [17][18][19] In contrast, at least one study points toward a decrease in 25-year cardiovascular treatment mortality in the recent treatment period (4.3% for patients treated during 1989-2000 vs 5.7% for treatment in 1965-1976). 19 Data from modern ISRT and INRT concepts are still lacking. Moreover, most studies provide no detailed analysis of RT field designs and instead present a dichotomous stratification, analyzing only the presence or absence of RT.
The HD 17 trial is the only randomized trial in which both an IFRT and an INRT concept are present in the study arms. Information on accurate target volume delineation was provided in the study protocol and specified in accompanying publications. 20 Surprisingly, the introduction of the modern and smaller INRT treatment did not result in a significant dosimetric reduction for most OAR (Table 2). One reason for this may be the rather conservative clinical target volume to PTV margin used in HD 17, with a 2-cm margin in axial dimension and 3 cm for cranial-caudal expansion. 20 In contrast, the European Organisation for Research and Treatment of Cancer used a 1-cm margin for their definition of INRT 21 and outlined a successful reduction of recurrences in previously involved areas in their H10 trial. 7 In the favorable and unfavorable arms of this study, only 0 and 5 recurrences, respectively, occurred in an initially involved lymph node area after a combined treatment of chemo- and radiotherapy. 7
Regarding RT techniques, our analysis demonstrates the advantage of IMRT in avoiding high doses in organs like the lung, spinal cord, or thyroid at the expense of a greater low-dose exposure (Tables 3 and 4). This finding has been reported previously and may contribute to a greater risk of secondary malignancies in the long-term follow-up. 22,23 Therefore, a careful and individual risk-benefit analysis has to be performed without general recommendation of a uniform RT strategy, as reflected by modern guidelines. 24
Historically, quality assurance by the reference radiology and radiation oncology panel of the GHSG has proven to enhance accuracy in both diagnostics and therapy for HL. 11,12,14 The main reason for major deviations still lies in inadequate coverage of an involved region, which is in accordance with the literature. 11,12 A detailed recurrence analysis of the HD 16 trial showed that infield relapses constitute the major pattern of failure if RT is omitted from the treatment. 25 In contrast, adequate IFRT was able to reduce infield failures from 8.7% to 2.1%. 25 Importantly, inaccurate target volume coverage in HL has been identified as a risk factor for relapse. In an analysis by Kinzie et al, 26 incorrect field margins resulted in a recurrence rate of 50% in comparison with 15% in case of a correct setup. 26 Likewise, relevant protocol violations of RT in the GHSG HD 4 trial led to a decline in relapse-free survival (72% vs 84% at 7 years, P = .0043). 13 However, in comparison with the predecessor studies, the major deviation rates in RT planning in HD 16/17 were lower (11% and 16.7% for HD 16/17 vs 36.8% and 42.5% for HD 10/11 vs 36% and 45% for HD 13/14), illustrating a learning curve for the delineation of IFRT. 11,12 This development was enabled by a continuous educational effort conducted by our group, which includes workshops as well as contouring sessions and refresher courses at the annual meeting of our national radiation oncology society. In this regard, the introduction of INRT was not accompanied by a deterioration in quality, as no significant differences to the IFRT arm could be found (P = .418 and P = .466 for overall and major deviations, respectively). Further protocol violations due to technical reasons, incorrect setup, or (fraction) dose were less prevalent in both arms. Surprisingly, there was a reduced rate of correct IFRT series in HD 17 compared with HD 16, the reasons for which can only be speculated upon. It is possible that the study centers struggled to define adequate IFRT fields, being neither too large nor too narrow, in a direct comparison with the more precise INRT.
Concerning OAR delineation, contouring rates varied considerably between the different organs. Although some OAR may not have been required in every case, for example, the parotid glands in case of a mediastinal involvement, some inconsistency remains, for instance, contouring of both lungs while omitting the heart or female breasts. In particular, there was no case of coronary artery contouring. For this OAR, a linear relationship between dose and side effects has been described, with a 7.4% increase of coronary heart disease for every additional Gy of mean heart dose. 27 Therefore, dose maxima in the coronary arteries have to be avoided, keeping the dose as low as reasonably possible. 28 Inadequate contouring may bias the dosimetric results presented here: only contoured organs can be accounted for in the RT planning process, and this may blur potential differences, for instance, in the comparison between IMRT and 3D-CRT. Therefore, systematic and detailed contouring should be a major focus for further educational activities.
Dose constraints have been reported by the International Lymphoma Radiation Oncology Group. 28 Ideal doses include a mean heart dose <5 Gy, a mean breast dose <4 Gy, a V 5 <55%, and V 20 <30% in the lung, a mean lung dose <10 Gy, as well as a V 25 of the thyroid <62.5%. When using 30 Gy in HD 17, some constraints were not met, for example, V 5 in the lungs or the mean heart dose, underlining the importance of a careful RT planning.
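The dose constraints listed above lend themselves to a simple automated plausibility check. The following Python sketch encodes only the thresholds named in this paragraph; the dictionary keys, the function name `check_constraints`, and the example plan values are hypothetical and are not part of any trial software. Organs without a value are reported separately rather than counted as compliant, because an organ that was not contoured cannot be evaluated.

```python
# Thresholds as listed in the text; a plan meets a constraint only if
# its value is strictly below the limit. Key names are hypothetical.
DOSE_LIMITS = {
    "heart_mean_gy": 5.0,
    "breast_mean_gy": 4.0,
    "lung_v5_pct": 55.0,
    "lung_v20_pct": 30.0,
    "lung_mean_gy": 10.0,
    "thyroid_v25_pct": 62.5,
}


def check_constraints(plan, limits=DOSE_LIMITS):
    """Return (violated, not_evaluated) constraint names for one plan.

    `plan` maps constraint names to measured values; missing or None
    entries stand for organs without contour or dose information.
    """
    violated, not_evaluated = [], []
    for name, limit in limits.items():
        value = plan.get(name)
        if value is None:
            not_evaluated.append(name)
        elif value >= limit:
            violated.append(name)
    return violated, not_evaluated


# Invented example plan (not patient data); breast was not contoured
example_plan = {
    "heart_mean_gy": 6.2,
    "lung_v5_pct": 58.0,
    "lung_v20_pct": 22.0,
    "lung_mean_gy": 9.1,
    "thyroid_v25_pct": 40.0,
}
violated, not_evaluated = check_constraints(example_plan)
print("violated:", violated)
print("not evaluated:", not_evaluated)
```

Keeping missing organs in a separate list makes gaps in OAR delineation visible instead of silently treating them as met constraints.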
As a consequence of the low RT doses used in the protocols, grade 3 toxicities were rare, with only 11 events in both studies. Correspondingly, grade 3 or 4 toxicities were registered in a total of 3% to 26% of cases in the study arms of HD 16 and HD 17, respectively. 9,10 Focusing on RT toxicity, only 3.4% of patients in HD 16 suffered from grade 3 side effects, the majority being dysphagia (1.8%) and mucositis (0.9%). 25
The present study has some limitations. Despite all efforts, it was not possible to obtain all INRT plans from HD 17 due to an incomplete response rate. Furthermore, the number of patients with infradiaphragmatic disease was limited, which prevented a decisive subanalysis. The number of IFRT cases analyzed was intentionally limited, which may lead to a false assessment of failure rates. However, as the respective percentages are in line with both HD 16 and the INRT cohort of HD 17, this is unlikely. In addition, a matched cohort analysis between INRT and IFRT could not be established due to limited patient numbers. Technically, the HD 16/17 trials were conducted during a period in which IMRT was not used routinely in many departments. Thus, advanced techniques like butterfly volumetric arc therapy were only applied in rare cases. 29 Future workshops may help to spread the knowledge on these techniques and enable a widespread application.
The results of HD 17 and the H10U trial by the European Organisation for Research and Treatment of Cancer established a risk-adapted treatment strategy for patients with intermediate-stage HL, 7,10 which demands an individualized and modern RT conceptualization. This evolution will continue: In the recent GHSG NIVAHL study, immunotherapy was introduced in the first-line treatment, which will alter treatment responses and probably RT design. 30 In the end, further long-term analyses will be needed to examine a possible correlation between RT quality and oncological outcomes in the context of INRT and ISRT. In the meantime, continuous educational activities are needed to maintain the high quality of RT planning and execution demonstrated in this analysis.