Monday, 16 August 2010

Development of a clinical decision model for thyroid nodules

Alexander Stojadinovic (corresponding author),1,3 George E Peoples,2,3 Steven K Libutti,4 Leonard R Henry,3,5 John Eberhardt,6 Robin S Howard,7 David Gur,8,9 Eric A Elster,5 and Aviram Nissan3,10
1Department of Surgery, Division of Surgical Oncology, Walter Reed Army Medical Center, Washington, D.C., USA
2Department of Surgery, Brooke Army Medical Center, Fort Sam Houston, TX, USA
3The United States Military Cancer Institute, Washington, D.C., USA
4Angiogenesis Section, Surgery Branch, National Cancer Institute, Bethesda, MD, USA
5Department of Surgery, National Naval Medical Center, Bethesda, MD, USA
6BioInformatics Division, DecisionQ, Washington, D.C., USA
7Department of Clinical Investigation, Division of Biostatistics, Walter Reed Army Medical Center, Washington, D.C., USA
8Department of Radiology, University of Pittsburgh, PA, USA
9Magee-Women's Hospital, Pittsburgh, PA, USA
10Department of Surgery, Hadassah-Hebrew University Medical Center, Mount Scopus, Jerusalem, Israel
Alexander Stojadinovic: alexander.stojadinovic@amedd.army.mil; George E Peoples: georgepeoples@hotmail.com; Steven K Libutti: slibutti@nih.gov; Leonard R Henry: leonard.henry@med.navy.mil; John Eberhardt: john.eberhardt@decisionq.com; Robin S Howard: robin.howard@amedd.army.mil; David Gur: gurd@upmc.edu; Eric A Elster: eric.elster@med.navy.mil; Aviram Nissan: anissan@hadassah.org.il

Abstract
 
Background
Thyroid nodules represent a common problem brought to medical attention. Four to seven percent of the United States adult population (10–18 million people) has a palpable thyroid nodule; however, the majority (>95%) of thyroid nodules are benign. While fine needle aspiration (FNA) remains the most cost-effective and accurate diagnostic tool for thyroid nodules in current practice, over 20% of patients undergoing FNA of a thyroid nodule have indeterminate cytology (follicular neoplasm) with an associated malignancy risk of 20–30%. These patients require thyroid lobectomy/isthmusectomy purely for the purpose of attaining a definitive diagnosis. Given that the majority (70–80%) of these patients have benign surgical pathology, thyroidectomy in these patients is conducted principally with diagnostic intent. Clinical models predictive of malignancy risk are needed to support treatment decisions in patients with thyroid nodules in order to reduce morbidity associated with unnecessary diagnostic surgery.
Methods
Data were analyzed from a completed prospective cohort trial conducted over a 4-year period involving 216 patients with thyroid nodules undergoing ultrasound (US), electrical impedance scanning (EIS) and fine needle aspiration cytology (FNA) prior to thyroidectomy. A Bayesian model was designed to predict malignancy in thyroid nodules based on multivariate dependence relationships between independent covariates. Ten-fold cross-validation was performed to estimate classifier error: the data set was randomized into ten unique train-and-test pairs, each consisting of a training set (90% of records) and a test set (10% of records). A receiver-operating-characteristic (ROC) curve of these predictions and the area under the curve (AUC) were calculated to determine model robustness for predicting malignancy in thyroid nodules.
Results
Thyroid nodule size, FNA cytology, US and EIS characteristics were highly predictive of malignancy. Cross validation of the model created with Bayesian Network Analysis effectively predicted malignancy [AUC = 0.88 (95%CI: 0.82–0.94)] in thyroid nodules. The positive and negative predictive values of the model are 83% (95%CI: 76%–91%) and 79% (95%CI: 72%–86%), respectively.
Conclusion
An integrated predictive decision model using Bayesian inference incorporating readily obtainable thyroid nodule measures is clinically relevant, as it effectively predicts malignancy in thyroid nodules. This model warrants further validation testing in prospective clinical trials.

Background
 
Thyroid nodules represent a common problem brought to medical attention. Four to seven percent of the United States adult population (10–18 million people) has a palpable thyroid nodule(s), and up to 50% of American women older than age 50 have nodules visible by ultrasound [1]. The majority (>95%) of thyroid nodules are benign; however, malignancy risk increases with male gender, nodule size, rapid growth and associated symptoms, extremes of age (< 30 and > 60 years), underlying autoimmune disease, nodule growth under thyroid hormone suppression, personal or family history of thyroid malignancy and radiation exposure [2].
Thorough history and physical examination, serum thyrotropin (TSH) level, thyroid ultrasound (US) and fine needle aspiration (FNA) comprise the standard evaluation of patients with thyroid nodules. Patients with thyroid nodules typically undergo both thyroid US and FNA. In particular, nodules with maximum diameter > 1.0–1.5 cm with solid elements, or nodules demonstrating suspicious features on US, should undergo FNA [3]. Given the increased risk of malignancy in so-called thyroid incidentalomas detected by 18FDG-PET (14–50%) or sestamibi scan (22–66%), FNA is indicated under these circumstances as well [4,5].
Fine needle aspiration remains the most cost-effective and accurate diagnostic tool for thyroid nodules in current practice. Although a standard of practice, FNA remains an imperfect diagnostic test for thyroid nodules, particularly when one considers the high frequency (>20%) of indeterminate cytology. A six-tier classification system for FNA is favored, in which the risk of malignancy increases across the spectrum: unsatisfactory or non-diagnostic FNA (unknown), benign (<1%), follicular lesion (atypia) of undetermined significance (5–10%), follicular neoplasm (20–30%), suspicious for malignancy (50–75%), and malignant (100%) [3]. In experienced hands, sensitivity and specificity are very high, 95% and 99%, respectively, but the sensitivity and specificity of FNA vary considerably, as they are highly dependent on the skills of the operator as well as the cytologist [6,7]. In studies where cytology was compared to histology or revised by an expert cytologist, inaccuracy of the initial diagnosis was observed in up to 61% of the cases [8]. Unfortunately, over 20% of patients undergoing FNA of a thyroid nodule have indeterminate cytology (follicular neoplasm) with an associated malignancy risk of 20–30%, and they require thyroid lobectomy/isthmusectomy purely for the purpose of attaining a definitive diagnosis. Given that the majority (70–80%) of patients with "follicular neoplasm" have benign surgical pathology, thyroidectomy in these patients is conducted principally with diagnostic intent [9].
This emphasizes the need for non-invasive diagnostic imaging modalities with improved cancer detection accuracy coupled with clinically-relevant, treatment-directing malignancy risk prediction models to assist the clinician in the interpretation of available diagnostic information and minimize the frequency of purely diagnostic thyroid resections. We have previously studied the potential value of electrical impedance scanning (EIS) of thyroid nodules in a prospective feasibility trial. The overall diagnostic accuracy (73%) of EIS in that study was clinically meaningful, as utilization of the technology could result in a significant reduction (67%) in the number of purely diagnostic thyroid resections for cytologically determined follicular neoplasm [10].
Bayesian Belief Networks or Bayesian classification has gained acceptance as a methodology for characterizing multi-dimensional or complex data sets pursuant to developing disease risk prediction models [11,12]. A Bayesian Belief Network (BBN) is a graphical model that represents variables and their probabilistic independencies. Clinical observations such as symptoms, imaging data and lab results may be encoded into a BBN in order to estimate the probability of a disease or disorder [13,14].
Advances in machine learning allow users to train these networks on complex clinical problems using an intuitive computer program [15]. A BBN encodes the joint probability distribution of all the variables in the data set by building a directed acyclic network of conditional probabilities incorporating independent predictor nodes (variables), each with its own prior probability [11,16]. Conditional independence statements are embedded in the network structure through the arcs that connect the network's nodes [15]. These network arcs between nodes define a hierarchy and structure of information. Bayesian networks allow clinicians to derive insights about the data domain because the networks are graphical, hierarchical representations of how conditionally independent variables associate to inform a dependent outcome of interest, such as presence of malignancy. The inferential structure of the network allows the clinician to collect a priori evidence of independent variables, add this knowledge to the network and receive a posteriori probability of outcome.
We hypothesized that a Bayesian Belief Network analytical tool could be constructed using a machine learning platform applied to this specific patient study population represented by relevant clinical variables (e.g. patient age, gender, thyroid nodule size, US and impedance characteristics, and FNA cytology) in order to develop a model-derived risk assessment tool, which could support decision making on the basis of individual patient risk of malignancy. We further hypothesized that co-dependent analysis of EIS in the context of standard testing (US, FNA) would increase the utility of all of these studies through clinical decision support. The primary focus of this analysis is to determine the feasibility of a Bayesian predictive model to assist the clinician in interpreting diagnostic information.

Methods
 
We trained a Bayesian classifier on a prospectively enrolled cohort [n = 216; 110 (51%) with malignant thyroid nodules] collected over a four-year period (Sept 2002 – Dec 2006) in the context of a previously published IRB-approved clinical trial including thyroid impedance, ultrasound imaging, cytological and histopathological outcome data [10,17]. This was a prospective single-arm observational cohort trial evaluating the diagnostic accuracy of pre-operative thyroid EIS in patients scheduled to undergo thyroidectomy. Fifty percent of patients (n = 109) were undergoing diagnostic thyroidectomy for indeterminate FNA cytology. The objective was to train and validate a classifier that could be used for clinical decision making.
Thyroid EIS Examination
Thyroid EIS was performed as described previously [10]. Thyroid EIS was conducted prior to thyroid surgery using the T-Scan 2000ED [TransScan Medical (Mirabel®), Austin, TX].
Impedance recordings of conductivity and capacitance were obtained over the entire gland in a predetermined sequence using a real-time image acquisition technique over a broad frequency range (frequency range, 50–20,000 Hz). A gray-scale impedance map provided an anatomical image corresponding to the area of interest directed to a palpable or sonographic thyroid nodule.
Homogeneous gray scale impedance maps (uniform conductivity and capacitance) are characteristic of normal or benign thyroid nodules, which demonstrate similar conductivity and capacitance (or impedance) to normal thyroid tissue. A focal disturbance in electrical field distribution by a malignant tumor, due to its increased conductivity and/or capacitance (or decreased impedance), appears as a focal bright white spot on the gray scale impedance map. Changes from baseline sternocleidomastoid conductivity and capacitance were calculated for the thyroid nodule(s). A positive EIS examination was previously defined as a focal bright spot over a thyroid nodule correlating with increased conductivity (decreased impedance) and/or capacitance more than 25% above baseline sternocleidomastoid muscle impedance, absent confounding local artifact [10,17].
For the purpose of this study, two surgical oncologists (AS, AN) with extensive experience with EIS in general, and the T-Scan 2000ED in particular, performed critical review of impedance scans conducted in the previous trial, both blinded to fine needle aspirate cytology and surgical pathology results. They determined an EIS level of suspicion (LOS) score on the basis of the presence of a focal white spot and increased conductivity and/or capacitance (with the previously established cutoff of 25% above baseline impedance) associated with the palpable or sonographic thyroid abnormality. Thyroid nodule Level of Suspicion was classified as follows in the blinded review: LOS 1: Definitely benign; LOS 2: Highly unlikely to be malignant; LOS 3: Unlikely to be malignant; LOS 4: Likely to be malignant; and LOS 5: Highly likely to be malignant. Thyroid nodules corresponding to a palpable or sonographic abnormality determined to have LOS of 4 or 5 were considered EIS-positive; otherwise they were regarded as EIS-negative.
All study subjects underwent thyroid resection after thyroid US, FNA and EIS. Surgical histopathology was correlated with sonographic, cytological and impedance findings and interpretations.
FNA and surgical specimens were evaluated by experienced board-certified cytologists and thyroid pathologists who rendered cytological and histopathological diagnosis without knowledge of EIS level of suspicion for malignancy.
Statistical Analysis Plan Using Bayesian Belief Networks
Study data were collected and assembled into a data set consisting of 216 subjects, 109 with indeterminate cytology results. Biopsy results were classified based on established clinical guidelines into either Benign (n = 106) or Malignant (n = 110) diagnoses and assembled into a master data set. The master data set was then randomized into ten additional cross-validation sets. Each subject record was assigned a randomly generated number. These numbers were then used to assign the subjects to ten unique test groups. A unique training set consisting of the remaining 90% of cases was created for each test group.
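The randomization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually run in FasterAnalytics; the function name `ten_fold_splits`, the fixed seed and the use of integer record identifiers are assumptions made for the example.

```python
import random

def ten_fold_splits(record_ids, n_folds=10, seed=0):
    """Return a list of (train_ids, test_ids) pairs, one per fold."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    # Assign each record a random number, then order records by that number.
    keys = {rid: rng.random() for rid in record_ids}
    ordered = sorted(record_ids, key=keys.__getitem__)
    # Deal the shuffled records into n_folds disjoint test groups.
    folds = [ordered[i::n_folds] for i in range(n_folds)]
    splits = []
    for test_ids in folds:
        held_out = set(test_ids)
        # The training set is every record not held out for this fold.
        train_ids = [rid for rid in ordered if rid not in held_out]
        splits.append((train_ids, test_ids))
    return splits

# 216 placeholder identifiers standing in for the study subjects.
splits = ten_fold_splits(range(216))
for train_ids, test_ids in splits:
    print(len(train_ids), len(test_ids))
```

With 216 records, each test group holds 21 or 22 subjects (roughly 10%), and every subject appears in exactly one test group.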
Data analysis was then conducted using a Bayesian Belief Network (BBN). The BBN was built by applying a set of heuristics to generate predictive models with different conditional independence assumptions. The BBN we constructed encoded the joint probability distributions of all the variables in our clinical data set from our previously published clinical trial by building a network of conditional probabilities [17]. The BBN is a directed network incorporating parent-child relationships between nodes. The network was queried to provide estimates for posterior probabilities given a priori knowledge, and tested for accuracy using data withheld from the training model. The Bayesian network in this study was constructed using FasterAnalytics™, (DecisionQ, Washington, DC).
The network was validated using a train-and-test cross-validation methodology, in this instance ten-fold cross-validation. Cross-validation is an established technique in multivariate analysis which allows researchers to estimate the performance of predictive models when used outside of the research setting. This analysis calculates predictive values by classifying the outcome (surgical pathology diagnosis) for a given instance and comparing this prediction to the known value in an independent test set. The test set predictions were then used to calculate a receiver-operating characteristic (ROC) curve and inference matrix by threshold for each test set by clinical feature of interest.
The curve was calculated by comparing the predicted value for each feature of interest to the known value in the test set on a case-specific basis, rank-ordering the resulting predictions from most likely to least likely and calculating the curve using the assumption that the most likely cases would be evaluated first. This curve was then used to calculate area-under-the-curve (AUC), positive and negative predictive value (PPV and NPV). 
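The rank-ordering approach to the ROC curve, and the predictive values at a fixed probability threshold, can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the six labels and scores are toy values rather than study data, and the area is computed with the trapezoidal rule, which the text does not specify.

```python
def roc_points(y_true, scores):
    """ROC points obtained by evaluating cases from most to least likely."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Cases with tied scores are processed together and share one point.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))  # (1 - specificity, sensitivity)
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def predictive_values(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV when score >= threshold means malignant."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    ratio = lambda a, b: a / b if b else float("nan")
    return {"sensitivity": ratio(tp, tp + fn), "specificity": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp), "npv": ratio(tn, tn + fn)}

# Toy example: 1 = malignant, 0 = benign; scores are predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc_trapezoid(roc_points(labels, scores))
values = predictive_values(labels, scores, threshold=0.5)
print(round(area, 3), values)
```

In a cross-validation setting, the labels and scores for each fold would come from that fold's held-out test set.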
Results
 
Clinical, image-based, cytological and pathological characteristics of the population are shown in Table 1. Importantly, the disproportionately high prevalence of indeterminate cytology reflects a population referred for operation and enrolled in surgical clinics. These clinical data were encoded in a Bayesian Belief Network (BBN) in order to estimate the probability of thyroid nodule histopathology. As only very few patients in this cohort had a family history of thyroid cancer or exposure to radiation, we elected not to include these parameters in the BBN. Figure 1 shows the ROC curve and area under the curve for the model tested a posteriori against the master data set of 216 patients for cancer detection, 110 with malignancy.
Table 1
Characteristics of the study population
Figure 1
ROC Curve for cancer prediction in validation against the master data. Sensitivity is plotted on the y-axis and 1-specificity is plotted on the x-axis.
The completed BBN was cross-validated using the training and test sets, and also tested a posteriori against the master data set to assess predictive power. Table 2 details the cross-validation results for each train-and-test pair and the a posteriori testing results. Cross-validation of the model created with Bayesian Network Analysis effectively predicted malignancy [AUC = 0.88 (95% Confidence Interval (CI): 0.82–0.94)] in thyroid nodules. The positive and negative predictive values of the model are 83% (95%CI: 76%–91%) and 79% (95%CI: 72%–86%), respectively. Sensitivity and specificity of the model are 82% (95%CI: 74%–91%) and 77% (95%CI: 68%–86%), respectively, at a 50% threshold. In the cross-validation of the BBN model developed in this study, a 50% probability threshold (most likely case) for calling a case malignant or benign produced the best predictive performance in this dataset (Tables 2 and 3).
Table 2
BBN cross-validation results for each train-and-test pair and a posteriori testing results
Table 3
Contribution of first-order predictors (thyroid nodule size, US and EIS characteristics, and FNA cytology) to predictive power of the Bayesian model
The resulting BBN is a directed graph of conditional dependence between variables. Figure 2 shows the structure of the BBN developed in this study to predict final histopathology in 216 patients with thyroid nodules. The structure of the BBN shows that four variables share direct conditional dependence with final histopathology: fine needle aspiration (FNA) cytology, maximum nodule size (determined by ultrasound), electrical impedance scan (EIS) and ultrasound (US) characteristics of the nodule. The relative contribution of each of these four factors was determined by excluding each factor one at a time in a posteriori analysis against the master data set of 216 patients. The only factor that significantly degraded the model, when eliminated from the network, was thyroid nodule EIS (Table 3). Further, the variables patient age, thyroid nodule size, scintigraphic findings (hot, warm, cold), and EIS characteristics are also conditionally dependent on one another and, through thyroid nodule size and EIS characteristics, inform final histopathology.
Figure 2
Bayesian Belief Network model: Pathological diagnosis (Overall Pathology Dx) in thyroid nodules (Benign versus Malignant). The model structure defines four critical predictors of thyroid nodule histopathology (red circles): fine needle aspiration (FNA) ...
With a trained, tested, and cross-validated model, the clinician can add evidence to the model given prior knowledge of a specific case through the selection of specific features and generate case-specific predictions of final histopathology. A patient with a thyroid nodule EIS level of suspicion of 2 (highly unlikely to be malignant) has a posterior probability of cancer of 19%. Adding a thyroid nodule ultrasound finding of 'solid' to the EIS level of suspicion of 2 refines the case-specific posterior estimate of malignancy to 23%, which is less than the cancer rate in the study population. Additional data refine the prediction of malignancy even further; indeterminate FNA cytology of a 'solid' nodule by US with an EIS level of suspicion of 2 has a posterior probability of benignity of 85% (15% probability of malignancy). Changing the EIS result from highly unlikely to be malignant (LOS 2) to a level of suspicion of 4 (likely to be malignant) increases the posterior probability of malignancy from 15% to 65% (Figure 3).
Figure 3
Posterior estimate of surgical pathology outcome derived from prior knowledge of EIS result (EIS level of suspicion of 4 – likely to be malignant), ultrasound finding of solid thyroid nodule, and indeterminate FNA cytology. Changing the EIS result ...
Inference-based individual case-specific estimates of posterior probability from the Bayesian Belief Network can also be developed by applying the model to new data sets in either batch inference mode or by tabulating all potential combinations in an inference table. Table 4 provides an example of an inference table calculated using the model developed in this study for all potential combinations of EIS and FNA cytology result, providing the clinician with a simple "look-up table" format which may be easier to interact with than the model. For example, the sixth case in Table 4, Definitely Benign EIS (Level of Suspicion of 1) and Indeterminate FNA cytology, has a probability of cancer of 5.7%. However, a patient with an indeterminate nodule with EIS Level of Suspicion of 4 (likely to be malignant) has a 58.7% probability of thyroid malignancy.
Table 4
Inference table calculated using the model developed in this study for all potential combinations of EIS and FNA result, selected subset.
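A look-up table of this kind is straightforward to represent in software. The sketch below, in Python, is illustrative only: it holds just the three EIS/FNA combinations whose probabilities are quoted in the text of this article (the remaining rows of Table 4 are not reproduced here), and the names `INFERENCE_TABLE` and `lookup_malignancy_probability` are hypothetical.

```python
# Keys are (EIS level of suspicion, FNA cytology); values are the posterior
# probability of malignancy quoted in the article text. Only quoted entries
# are filled in; all other combinations are deliberately left out.
INFERENCE_TABLE = {
    (1, "indeterminate"): 0.057,  # Definitely benign EIS
    (4, "indeterminate"): 0.587,  # Likely to be malignant
    (5, "indeterminate"): 0.736,  # Highly likely to be malignant (quoted in the Discussion)
}

def lookup_malignancy_probability(eis_los, fna_cytology):
    """Return the tabulated probability, or None if the combination is not listed."""
    key = (eis_los, fna_cytology.strip().lower())
    return INFERENCE_TABLE.get(key)

print(lookup_malignancy_probability(4, "Indeterminate"))
print(lookup_malignancy_probability(2, "Benign"))  # not tabulated here
```

Returning `None` for a missing combination, rather than a default probability, keeps the table from implying an estimate that the model did not produce.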
 
Discussion
 
The primary aim of this study was to develop a Bayesian Belief Network model based on data collected prospectively in the context of a clinical trial evaluating the feasibility of electrical impedance scanning in patients with thyroid nodules pre-determined to undergo thyroid resection. Relevant clinical variables were included in the model in order to develop a model-driven risk assessment tool, which could support decision making on the basis of individual patient risk of malignancy. The model created with Bayesian Network Analysis effectively predicted malignancy [AUC = 0.88 (95%CI: 0.82–0.94)] in thyroid nodules. The positive and negative predictive values of the model are 83% (95%CI: 76%–91%) and 79% (95%CI: 72%–86%), respectively.
The thyroid nodule is a prevalent clinical problem in the United States, and the majority of nodules are pathologically benign. The increasingly frequent use of sensitive diagnostic modalities has contributed to an unprecedented rise in the incidence of differentiated thyroid carcinoma [18]. The preponderance of identified papillary thyroid cancer is small, sub-clinical, indolent disease; hence, the challenge to the clinician is differentiating tumors of favorable biology from those with notoriously aggressive behavior [18]. Although clinical indicators of malignancy risk can facilitate therapeutic decision making, they are imperfect in directing treatment for those patients most likely to benefit from thyroidectomy. Another vexing problem, more fundamental than defining the biology of malignancy, is that of definitively diagnosing the cytologically indeterminate thyroid nodule. This often necessitates diagnostic operation in a large proportion of patients with so-called "follicular neoplasms", possibly to benefit the few patients who actually have thyroid malignancy. Accurately predicting malignancy in any given thyroid nodule remains a daunting clinical challenge, establishing the need for decision support tools or predictive models to guide therapeutic decision making. The present study implemented Bayesian classification on a prospectively enrolled clinical trial cohort including clinical, image-based as well as cytological predictors of malignancy. A clinically relevant prognostic risk assessment tool was constructed and cross-validated, which provides individual patient-specific prediction of malignancy in thyroid nodules.
Bayesian classification has been applied across the spectrum of medicine [19,20], from optimization of pharmacotherapy dosing [21,22] and predicting cancer screening [23] and diagnostic test results [24,25], to determining injury severity [26] and ICU mortality [27], assessing operative risk [28] and predicting surgical outcomes [29-32]. More recently, BBN models have been developed to predict cancer-specific outcomes [33-37]. The findings of the current study demonstrate that the BBN model provides an individualized estimate of cancer risk in thyroid nodules in three clinically relevant FNA cytology categories: negative, positive, and indeterminate. The receiver operating characteristic curves can be used to optimize the model for negative and positive predictive value in our thyroid cohort. Importantly, a patient in our broad population with an indeterminate nodule with EIS Level of Suspicion of 4 or 5 (likely or highly likely to be malignant) has a 58.7% or 73.6% probability of thyroid malignancy, respectively, according to the prognostic risk assessment tool developed and cross-validated herein. The predictive model developed in this study not only provides an individualized estimate of risk of malignancy in patients with a broad spectrum of thyroid nodules; it can also support integration with clinical systems (electronic health record) and provide real-time estimates of risk, thereby facilitating clinical decision making and patient education. The iterative nature of the modeling methodology permits addition of new data, which can be used to update, re-train and validate, dynamically modify and optimize the model. Model optimization with new data input over time will be important, as patients with indeterminate FNA cytology and EIS level of suspicion ranging from 1–3 (Definitely benign to Unlikely to be malignant) in the current model have a clinically meaningful (~10%) likelihood of malignant histopathology.
There are several limitations inherent in the prognostic risk assessment tool constructed in this study. Other clinical data such as 18F-FDG Positron Emission Tomography, Doppler ultrasound, Magnetic Resonance Imaging and quantitative RT-PCR assays for thyroid-cancer-related genes of fine needle aspirates, which were not tested in this clinical trial, may be relevant and could improve the predictive value of the model.
Ultrasound variables considered in the BBN model development included primary thyroid nodule characteristics (e.g. solid versus cystic) and maximum dimension; however, other sonographic variables not measured in the study could have incremental predictive value, including color Doppler ultrasound-directed qualitative intra-nodular vascular distribution and microcalcifications, as well as quantitative analysis of tumor vascularity (tumor vascular resistive index). Although elimination of thyroid nodule impedance characteristics from the network significantly degraded the model in this study, thyroid impedance remains investigational and warrants further clinical validation. Further, while the predictive model was cross-validated to assess robustness, it remains to be independently and prospectively validated in a new, expanded and diverse patient population with thyroid nodules. This will be particularly important given another putative factor limiting the generalizability of our study results: the selected pre-operative, disease-enriched population. The increased prevalence of disease biases the estimates of the positive predictive value (overly optimistic) and the negative predictive value (overly pessimistic). The ultimate value of the model will rest in its ability to predict malignancy in a general population of patients with thyroid nodules, where the prevalence of malignancy is decidedly lower. Importantly, we anticipate that the validated model will be utilized in situations of clinical uncertainty after standard testing (US and FNA) in order to facilitate clinical decision making with respect to operative indication.
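The dependence of predictive values on prevalence follows directly from Bayes' theorem and can be illustrated numerically. The Python sketch below assumes that the sensitivity (82%) and specificity (77%) reported at the 50% threshold stay fixed across populations, which is a simplifying assumption and not a finding of the study. The 5% prevalence is an illustrative figure for a general population of thyroid nodules, and the function names are hypothetical. Because the reported PPV and NPV were estimated by cross-validation, this calculation is not expected to reproduce them exactly.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from Bayes' theorem."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

SENS, SPEC = 0.82, 0.77          # reported at the 50% threshold
study_prevalence = 110 / 216     # malignant cases in the study cohort
general_prevalence = 0.05        # illustrative assumption, not study data

for label, prev in [("study cohort", study_prevalence),
                    ("general population (assumed)", general_prevalence)]:
    print(f"{label}: PPV = {ppv(SENS, SPEC, prev):.3f}, "
          f"NPV = {npv(SENS, SPEC, prev):.3f}")
```

Under these assumptions the PPV falls and the NPV rises as prevalence drops, which is the direction of bias described above.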
Recognizing that individual variables, though independently associated with thyroid cancer, are insufficient for predicting the risk of malignancy in any given thyroid nodule, other investigators have stressed the importance of developing multivariate predictive algorithms to determine cumulative risk of malignancy for this common clinical problem [38,39]. Raza et al. utilized a multivariate stepwise regression model to predict malignancy in thyroid nodules in a highly selected patient population on the basis of patient age, calcifications in a sonographically solid nodule, and FNA cytology [39]. Tuttle, Lamar and Burch applied multivariate modeling in patients with indeterminate thyroid nodules and identified male gender, nodule size exceeding 4 cm, and character of the gland by palpation (dominant nodule in multi-nodular goiter) as predictors of the risk of thyroid malignancy [38]. Their analysis was limited to a narrow population of patients with follicular neoplasia by FNA, and did not include any imaging-based variables in the predictive model.

Conclusion
 
Our study is in agreement with these investigations in that it suggests that a broad, statistically validated network structure of multiple clinical variables has the potential to provide a universal method to individualize patient care. The dynamic, quantitative case-specific predictions made by this type of predictive model could allow clinical decision support tools to be adapted to the specific needs and capabilities of a given medical clinic. This preliminary yet promising clinical tool clearly warrants further validation testing in planned prospective trials. If prospective validation of the model is successful, we anticipate that the model will serve as a web-based clinical tool that physicians can access and use to evaluate the risk of malignancy in individual patients presenting with thyroid nodule(s).
Abbreviations
BBN: Bayesian Belief Network; EIS: Electrical Impedance Scanning; ICU: Intensive Care Unit; F-FDG PET: Fluorodeoxyglucose Positron Emission Tomography; FNA: Fine Needle Aspiration; NPV: Negative Predictive Value; PPV: Positive Predictive Value; ROC: Receiver Operating Characteristics; RT-PCR: Real Time Polymerase Chain Reaction; TSH: Thyrotropin.
Competing interests
The authors declare that they have no competing interests.

Authors' contributions
 
All authors read and approved the final manuscript.
AS conceived and designed the project, acquired, analysed and interpreted the data with statistical expertise, drafted and made critical revisions to the manuscript, obtained funding and supervised the overall project.
GEP acquired the data, made critical revisions to the manuscript, and supervised the project.
SKL analysed and interpreted the data, made critical revisions to the manuscript and supervised the project.
LRH made critical revisions to the manuscript.
JE conceived and designed the project, analysed and interpreted the data with statistical expertise, drafted and made critical revisions to the manuscript.
RSH analysed and interpreted the data with statistical expertise.
DG obtained funding, supervised the project and made critical revisions to the manuscript.
EAE made critical revisions to the manuscript.
AN acquired the data, drafted the manuscript, made critical revisions to the manuscript, obtained funding, and supervised the project.
Acknowledgements
 
The opinions or assertions contained herein are the private views of the authors and are not to be construed as official or reflecting the views of the Department of the Army, Department of the Navy, or the Department of Defense.
This study was supported by the Department of Surgery and Department of Clinical Investigation, Walter Reed Army Medical Center.
We owe a debt of gratitude to our patients who made this study possible. This work was supported through the tireless efforts of our research program manager, Mrs. Tiffany Felix.
 
BMC Surg. 2009; 9: 12.
Published online 2009 August 10. doi: 10.1186/1471-2482-9-12.
PMCID: PMC2731077

Neonatal Thyroid Function in Seveso 25 Years after Maternal Exposure to Dioxin

Andrea Baccarelli,1,2,3* Sara M Giacomini,2,3 Carlo Corbetta,4 Maria Teresa Landi,5 Matteo Bonzini,2,3 Dario Consonni,2,3 Paolo Grillo,2,3 Donald G Patterson, Jr.,6 Angela C Pesatori,2,3 and Pier Alberto Bertazzi2,3
1 Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, United States of America
2 Department of Occupational and Environmental Health, Clinica del Lavoro “L. Devoto,” University of Milan, Milan, Italy
3 Department of Preventive Medicine, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Maggiore Policlinico, Mangiagalli, Regina Elena Foundation, Milan, Italy
4 Neonatal Screening Laboratory, “V. Buzzi” Children's Hospital, Milan, Italy
5 Genetic Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health/Department of Health and Human Services (NIH/DHHS), Bethesda, Maryland, United States of America
6 EnviroSolutions Consulting, Jasper, Georgia, United States of America
Bruce Lanphear, Academic Editor
University of Cincinnati, United States of America
* To whom correspondence should be addressed. E-mail: abaccare@hsph.harvard.edu

Abstract
Background
Neonatal hypothyroidism has been associated in animal models with maternal exposure to several environmental contaminants; however, evidence for such an association in humans is inconsistent. We evaluated whether maternal exposure to 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD), a persistent and widespread toxic environmental contaminant, is associated with modified neonatal thyroid function in a large, highly exposed population in Seveso, Italy.
Methods and Findings
Between 1994 and 2005, in individuals exposed to TCDD after the 1976 Seveso accident we conducted: (i) a residence-based population study on 1,014 children born to the 1,772 women of reproductive age in the most contaminated zones (A, very high contamination; B, high contamination), and 1,772 age-matched women from the surrounding noncontaminated area (reference); (ii) a biomarker study on 51 mother–child pairs for whom recent maternal plasma dioxin measurements were available. Neonatal blood thyroid-stimulating hormone (b-TSH) was measured on all children. We performed crude and multivariate analyses adjusting for gender, birth weight, birth order, maternal age, hospital, and type of delivery. Mean neonatal b-TSH was 0.98 μU/ml (95% confidence interval [CI] 0.90–1.08) in the reference area (n = 533), 1.35 μU/ml (95% CI 1.22–1.49) in zone B (n = 425), and 1.66 μU/ml (95% CI 1.19–2.31) in zone A (n = 56) (p < 0.001). The proportion of children with b-TSH > 5 μU/ml was 2.8% in the reference area, 4.9% in zone B, and 16.1% in zone A (p < 0.001). Neonatal b-TSH was correlated with current maternal plasma TCDD (n = 51, β = 0.47, p < 0.001) and plasma toxic equivalents of coplanar dioxin-like compounds (n = 51, β = 0.45, p = 0.005).
Conclusions
Our data indicate that environmental contaminants such as dioxins have a long-lasting capability to modify neonatal thyroid function after the initial exposure.

Editors' Summary
 
Background.
 
The thyroid, a butterfly-shaped gland in the neck, controls the speed at which the human body converts food into the energy and chemicals needed for life. In healthy people, the thyroid makes and releases two hormones (chemical messengers that travel around the body and regulate the activity of specific cells) called thyroxine (T4) and triiodothyronine (T3). The release of T4 and T3 is controlled by thyroid-stimulating hormone (TSH), which is made by the pituitary gland in response to electrical messages from the brain. If the thyroid stops making enough T4 and T3, a condition called hypothyroidism (an underactive thyroid) develops. Adults with hypothyroidism put on weight, feel the cold, and are often tired; children with hypothyroidism may also have poor growth and mental development. Because even a small reduction in thyroid hormone levels increases TSH production by the pituitary, hypothyroidism is often diagnosed by measuring the amount of TSH in the blood; it is treated with daily doses of the synthetic thyroid hormone levothyroxine.
Why Was This Study Done?
Although hypothyroidism is most common in ageing women, newborn babies sometimes have hypothyroidism. If untreated, “neonatal” hypothyroidism can cause severe mental and physical retardation so, in many countries, blood TSH levels are measured soon after birth. That way, levothyroxine treatment can be started before thyroid hormone deficiency permanently damages the baby's developing body and brain. But what causes neonatal hypothyroidism? Animal experiments (and some but not all studies in people) suggest that maternal exposure to toxic chemicals called dioxins may be one cause. Dioxins are byproducts of waste incineration that persist in the environment and that accumulate in people. In this study, the researchers investigate whether exposure to dioxin (this name refers to the most toxic of the dioxins, 2,3,7,8-Tetrachlorodibenzo-p-dioxin) affects neonatal thyroid function by studying children born near Seveso, Italy between 1994 and 2005. An accident at a chemical factory in 1976 heavily contaminated the region around this town with dioxin and, even now, the local people have high amounts of dioxin in their bodies.
What Did the Researchers Do and Find?
The researchers identified 1,772 women of child-bearing age who were living very near the Seveso factory (the most highly contaminated area, zone A) or slightly further away where the contamination was less but still high (zone B) at the time of the accident or soon after. As controls, they selected 1,772 women living in the surrounding, noncontaminated (reference) area. Altogether, these women had 1,014 babies between 1994 and 2005. The babies born to the mothers living in the reference area had lower neonatal blood TSH levels on average than the babies born to mothers living in zone A; zone B babies had intermediate TSH levels. Zone A babies were 6.6 times more likely to have a TSH level of more than 5 μU/ml than the reference area babies (the threshold TSH level for further investigations is 10 μU/ml; the average TSH level among the reference area babies was 0.98 μU/ml). The researchers also examined the relationship between neonatal TSH measurements and maternal dioxin measurements at delivery (extrapolated from measurements made between 1992 and 1998) in 51 mother–baby pairs. Neonatal TSH levels were highest in the babies whose mothers had the highest blood dioxin levels.
What Do These Findings Mean?
These findings suggest that maternal dioxin exposure has a long-lasting, deleterious effect on neonatal thyroid function. Because the long-term progress of the children in this study was not examined, it is not known whether the increases in neonatal TSH measurements associated with dioxin exposure caused any developmental problems. However, in regions where there is a mild iodine deficiency (the only environmental exposure consistently associated with reduced human neonatal thyroid function), TSH levels are increased to a similar extent and there is evidence of reduced intellectual and physical development. Future investigations on the progress of this group of children should show whether the long-term legacy of the Seveso accident (and of the high environmental levels of dioxin elsewhere) includes any effects on children's growth and development.
 
Additional Information.
 
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050161.

Introduction
 
Variations in neonatal thyroid function evaluated at birth through blood thyroid-stimulating hormone (b-TSH) are associated with changes in iodine availability and maternal intake [1]. According to the World Health Organization (WHO), the percentage of newborns with b-TSH > 5 μU/ml should be less than 3% in iodine-replete populations [1]. Aside from iodine deficiency, no environmental exposure has been conclusively associated with reduced neonatal thyroid function in humans [2–4].
2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD), a ubiquitous low-level contaminant of the environment, is the most toxic compound of a class of toxicologically related environmental chemicals, including other dioxins, polychlorinated biphenyls (PCBs), furans, brominated compounds, polycyclic aromatic hydrocarbons, and organochlorine pesticides [5,6]. High exposures to dioxin have occurred in chemical workers, Vietnam Agent Orange veteran sprayers, accidental rice oil contamination, and recent isolated poisonings such as those of the current Ukrainian president and of five coworkers, two of whom had particularly elevated exposures, in Vienna, Austria [6]. In animal models, maternal exposure to TCDD induces elevated b-TSH and neonatal primary hypothyroidism [2,7–10], as TCDD and other related compounds have been shown to accelerate thyroid hormone clearance by increasing metabolic enzyme activity and competing with plasma binding proteins [11–13].
In July 1976, an industrial accident caused the exposure to high TCDD doses of a large population living in a residential area in Seveso, Italy [14]. TCDD has an extremely long half life, particularly in women (∼10 y) [15,16], and was still elevated in plasma samples collected from exposed women 20 y after the accident [17,18]. Starting in 1994, we conducted two related investigations on the Seveso population to determine whether maternal exposure to TCDD, as well as elevated current plasma dioxin levels, modified neonatal thyroid function.

Material and Methods
 
Study Population
After the Seveso accident, three zones of decreasing contamination (A, B, and R) were delimited based on TCDD soil concentrations. TCDD concentrations measured in soil samples shortly after the accident were between 15.5–580.4 μg/m3 in zone A, 1.7–4.3 μg/m3 in zone B, and 0.9–1.4 μg/m3 in zone R [19]. A cohort including all individuals living in these three zones (804 in A, 5,941 in B, 38,624 in R) and in the surrounding noncontaminated area (reference, 232,740 individuals) was established for follow-up studies [20]. Measurements performed on plasma samples collected from individuals within 1–2 y after the accident showed median TCDD levels of 447 parts per trillion (ppt) in zone A (n = 296), 94 ppt in zone B (n = 80), and 48 ppt in zone R (n = 48) [21]. A later campaign performed in 1993–1994 still found elevated plasma levels in the exposed population, particularly among women, who had geometric means of plasma TCDD equal to 60.5 ppt in zone A individuals (n = 7), 17.6 ppt in zone B individuals (n = 51), and 6.1 ppt in individuals from the reference area (n = 52) [17].
Seveso is located in Lombardy, a region in northern Italy with a population of 9.2 million where thyroid function is tested in all newborns by b-TSH measurements. Blood samples for b-TSH screening are taken 72 h after birth from a heel prick directly onto filter paper using a standardized collection protocol, and shipped to the Milan Central Neonatal Screening Laboratory where b-TSH determination is performed by fluorometric immunoassay using the AutoDELFIA automatic immunoassay system (PerkinElmer). Since January 1, 1994, all b-TSH data have been recorded in the Neonatal Screening Registry, which also includes information on date and hospital of birth, weight at birth, and place of residence. The two investigations described below received approval from the institutional review board of the Ospedale Maggiore Policlinico, Mangiagalli, Regina Elena Foundation, Milan, Italy.
Residence-Based Population Study
We selected from the Seveso cohort all the 1,772 women from the highly contaminated areas (A and B) who were: (i) residents of zone A (n = 186) or B (n = 1,231) at the time of the accident (July 10, 1976), or potentially exposed to TCDD, because they moved into zone A (n = 27) or B (n = 328) between July 10, 1976 and December 31, 1979; (ii) of fertile age (i.e., date of birth after December 31, 1947); (iii) alive on January 1, 1994. We randomly sampled 1,772 nonexposed women from the eligible (ii and iii above) female participants (n = 55,576) of the cohort established from the population of the reference area. Nonexposed women were frequency-matched to the exposed women by year of birth and residence in the reference area on the date of the accident (i above). We contacted 472 population registry offices (PROs) of the towns of residence of the women to identify all children born to the study participants. PRO personal records were traced for 1,761 (99.4%) of the 1,772 women from the contaminated areas (A and B) and 1,762 of the 1,772 women (99.4%) from the reference area. Because b-TSH measurements for our study were obtained from the Lombardy Neonatal Screening Registry, we excluded all children born outside the Lombardy region (n = 156; 13.3% of the 1,170 children traced). After such exclusion, we identified 1,014 singletons (56 from zone A, 425 from zone B, 533 from reference) born between January 1, 1994 and June 30, 2005 to 42 women from zone A, 327 women from zone B, and 403 women from the reference area. For all births, we obtained information on the type of delivery (vaginal, cesarean) through the regional registration system of hospital discharges. In a case-control study on chloracne conducted between 1993–1998 [22], most of the participants with elevated (> 0 ppt) TCDD levels were from zones A and B, whereas only a minority of zone R individuals exhibited elevated TCDD. 
In planning the residence-based population study, we hypothesized that contrasting newborns from zones A and B with those from the reference area would provide the most efficient design to detect potential effects of the exposure on neonatal TSH. Thus, the zone R population was not included in the study.
Study Based on Plasma Dioxin Measurements
We conducted a second investigation on the children born to the 109 women of fertile age (date of birth after 31 December 1947) who were part of the Seveso Chloracne Study [22]. The original population sample included 211 male and female healthy participants representative of the Seveso population and 101 individuals who had developed chloracne, the skin disorder associated with TCDD toxicity [22]. All participants gave written informed consent. Between 1 January 1994 and 30 June 2005, 51 children (12 from zone A, ten from zone B, 20 from zone R, and nine from Reference) were born to 38 of the 109 women; the remaining 71 women did not give birth in the study period. All children from zones A and B were also part of the residence-based population study (which included all zone A and B women), while none from zone R and reference had been sampled in the residence-based population study. Dioxin measurements were performed at the Centers for Disease Control (Atlanta, Georgia, United States of America) using a high-resolution gas chromatography/high-resolution mass spectrometry analysis [23] on plasma samples collected between December 1992 and September 1998. Specifically, 24 congeners were measured, including TCDD and six additional dibenzo-p-dioxins (PCDDs), ten dibenzofurans (PCDFs), and four coplanar PCBs. Starting from 1996, 36 non-coplanar PCBs, including six mono-ortho congeners, were added to the panel of congeners we tested for. Non-coplanar PCBs were thus measured on a subset of 37 of the 51 mother–child pairs. Results are reported in ppt, lipid adjusted. For women with concentrations > 10 ppt, plasma TCDD levels were extrapolated to the date of delivery with a first-order pharmacokinetic model [24,25], using the elimination rate estimated in Seveso (equivalent to 9.8 y half-life for women) [15].
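The extrapolation step can be illustrated with a minimal sketch of first-order (exponential) elimination. The 9.8 y half-life and the 10 ppt cut-off come from the text; the function name, the choice to leave values at or below the cut-off unchanged, and the assumption of pure decay toward zero (with no background-exposure term) are simplifications made here for illustration and may differ from the model in [24,25].

```python
import math

HALF_LIFE_YEARS = 9.8   # half-life for women reported in the text [15]
THRESHOLD_PPT = 10.0    # extrapolation is applied only above this level


def extrapolate_tcdd(measured_ppt, years_to_delivery,
                     half_life=HALF_LIFE_YEARS):
    """Project a measured plasma TCDD level (ppt) to the delivery date.

    years_to_delivery is delivery date minus sampling date, in years;
    a negative value means the child was born before the blood draw.
    Assumption: simple first-order decay, C(t) = C0 * exp(-k * t).
    """
    if measured_ppt <= THRESHOLD_PPT:
        # Low levels are treated as roughly constant and left as measured.
        return measured_ppt
    k = math.log(2) / half_life  # elimination rate constant (per year)
    return measured_ppt * math.exp(-k * years_to_delivery)


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    print(extrapolate_tcdd(100.0, 9.8))   # one half-life later
    print(extrapolate_tcdd(8.0, 5.0))     # below the cut-off: unchanged
```

Under these assumptions a level measured one half-life before delivery is halved, while a level measured after delivery (negative interval) is scaled upward.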
Toxic equivalent concentrations (TEQs) were defined for a mixture of dioxin-like compounds as the sum, over all congeners in the mixture, of the concentration of each congener multiplied by its specific toxic equivalency factor (TEF) [26]. Maternal mean TCDD levels were 18.9 ppt (n = 51, range 1.4–309.5). Mean plasma TEQs were 44.8 ppt (n = 51, range 11.6–330.4) for PCDDs, PCDFs, and coplanar PCBs; and 1.8 ppt (n = 37, range 0.6–4.2) for non-coplanar PCBs. Total mean TEQs, including the sum of PCDDs, PCDFs, coplanar PCBs, and non-coplanar PCBs, were 41.8 ppt (n = 37, range 12.2–334.5). Although total TEQs also included TEQs from PCDDs, PCDFs, and coplanar PCBs, mean total TEQs were lower than TEQs from PCDDs, PCDFs, and coplanar PCBs. Measurement of non-coplanar PCBs, which were used to compute total TEQs, started later during the study (from 1996). Thus, total TEQs are available only on a subset of participants who have lower dioxin levels, likely because of later sampling and longer time from the accident.
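As a rough illustration of the TEQ calculation, the sketch below sums concentration times TEF over the congeners of a mixture. Only the TEF of 1 for TCDD (the reference compound) is a known value; the other congener names, TEFs, and all concentrations are hypothetical placeholders, not the factors from [26] or measurements from this study.

```python
def toxic_equivalents(concentrations_ppt, tefs):
    """Return the TEQ (ppt) of a mixture: sum of concentration * TEF.

    concentrations_ppt: mapping congener name -> lipid-adjusted level (ppt)
    tefs: mapping congener name -> toxic equivalency factor
    """
    missing = sorted(set(concentrations_ppt) - set(tefs))
    if missing:
        raise KeyError(f"no TEF available for: {missing}")
    return sum(level * tefs[name]
               for name, level in concentrations_ppt.items())


if __name__ == "__main__":
    # Hypothetical TEFs; TCDD is the reference compound with TEF = 1.
    example_tefs = {"TCDD": 1.0, "congener_x": 0.1, "congener_y": 0.01}
    # Hypothetical plasma levels for one participant.
    example_levels = {"TCDD": 20.0, "congener_x": 50.0, "congener_y": 300.0}
    print(toxic_equivalents(example_levels, example_tefs))
```

Raising an error for a congener without a TEF, rather than silently skipping it, is a design choice made here so that an incomplete factor table cannot understate the total.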
Statistical Analysis
b-TSH levels were log-transformed to approximate a normal distribution. Consequently, geometric b-TSH means and 95% confidence intervals (CIs) are shown. Graphical distributions of b-TSH were plotted using the Epanechnikov kernel function to obtain density estimates. In descriptive analyses, we used Fisher's exact test to evaluate associations of the study participants' general characteristics with the zone of residence, and the Student t-test for associations with b-TSH. We calculated correlations with b-TSH and tests for trend using linear regression analysis. Unconditional logistic regression was used to estimate relative odds of elevated b-TSH levels (> 5 μU/ml). In multivariate analyses, regression models included gender, birth weight, birth order of the newborn, maternal age at delivery, hospital, and type of delivery as independent variables. Correlation between siblings was accounted for by using generalized estimating equations in all models. However, in analyses testing differences for groups that included a very small number of observations, generalized estimating equations may produce unreliable results. In such instances, we used either Fisher's exact test (for categorical outcomes) or the Wilcoxon (Mann-Whitney) nonparametric test (for continuous outcomes), as indicated in the text. We performed tests for trend across contamination zones by scoring the areas using the logarithm of the geometric means of plasma TCDD (60.5 ppt in zone A, 17.6 ppt in zone B, and 6.1 ppt in the reference area) measured in female participants in a previous investigation conducted between 1993 and 1995 [17]. In the residence-based population study, results did not show major differences after excluding 66 children (6.5% of the total sample size) born to mothers who had moved into the study areas after the date of the accident. Results including all newborns are reported throughout the paper.
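The log-transform approach to geometric means can be sketched as follows: compute the mean and standard error on the log scale, build the interval there, and exponentiate back. This is a simplified illustration, not the authors' Stata procedure: it uses a normal quantile and treats observations as independent, whereas the paper accounts for correlation between siblings with generalized estimating equations. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(values, level=0.95):
    """Return (geometric mean, lower, upper) for strictly positive values.

    The interval is computed on the natural-log scale and then
    back-transformed, so it is asymmetric around the geometric mean.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    logs = [math.log(v) for v in values]
    centre = mean(logs)
    std_err = stdev(logs) / math.sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return (math.exp(centre),
            math.exp(centre - z * std_err),
            math.exp(centre + z * std_err))


if __name__ == "__main__":
    # Hypothetical b-TSH readings in uU/ml, not data from the study.
    print(geometric_mean_ci([0.6, 0.9, 1.1, 1.4, 2.3, 5.2]))
```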
In the study based on plasma dioxin, information on additional possible confounders, including maternal body mass index (BMI), smoking habits, alcohol consumption, and neonatal age in hours at b-TSH measurement, was available. In multivariate models, inclusion or exclusion of these variables did not modify statistical significance. To confirm the results of linear models, we used Spearman's rank-correlation statistics. In both studies, statistical significance was not modified by adding to the models indicator variables for year of birth to adjust for trends through the study period. All tests were two-sided. All analyses were performed in Stata 9.0 (Stata Corporation).
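For readers unfamiliar with the rank-based check mentioned above, Spearman's coefficient is Pearson's correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration under that definition; the helper names are hypothetical, and it does not compute a p-value or reproduce the Stata routine used in the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering matters, a few very large exposure values influence this statistic far less than they influence a linear regression slope.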

Results
 
Residence-Based Population Study
The characteristics of the newborns included in the residence-based population study were similar across the three contamination zones (Table 1). Neonatal b-TSH values ranged between 0.2 and 14.0 μU/ml. Mean neonatal b-TSH decreased with increasing birth weight (p = 0.03), consistent with previous observations on other populations [27], and showed moderate, nonsignificant variations in association with birth order and type of delivery (Table 1). Mean neonatal b-TSH levels were significantly higher in the populations who lived in the TCDD-contaminated zones at the time of the accident (Table 2). Mean b-TSH was 0.98 μU/ml (95% CI 0.90–1.08) in the reference population, 1.35 μU/ml (95% CI 1.22–1.49) in zone B, and 1.66 μU/ml (95% CI 1.19–2.31) in zone A, the most contaminated area (p < 0.001 for trend across zones). Distributions of b-TSH by contamination zones are shown in Figure 1. The proportion of newborns with b-TSH > 5 μU/ml (Table 3) was equal to 2.8% in the reference area, 4.9% in zone B, and 16.1% in zone A (p < 0.001). Compared to the reference area, the relative odds of elevated b-TSH increased through contamination zones, with odds ratio (OR) = 1.79 (95% CI 0.92–3.50) for zone B and OR = 6.60 (95% CI 2.45–17.8) for zone A (p = 0.002 for trend across zones).
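The crude odds ratios can be approximated from a 2×2 table. In the sketch below, the group sizes (533, 425, 56) come from the text, but the case counts (15, 21, 9) are back-calculated here from the reported percentages (2.8%, 4.9%, 16.1%) and are therefore an assumption. The Wald interval is a textbook approximation that ignores sibling correlation, so it need not match the intervals reported in the paper.

```python
import math
from statistics import NormalDist


def odds_ratio(exposed_cases, exposed_total, ref_cases, ref_total,
               level=0.95):
    """Return (OR, lower, upper) with a Wald interval on the log scale."""
    a, b = exposed_cases, exposed_total - exposed_cases
    c, d = ref_cases, ref_total - ref_cases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_or = math.log(estimate)
    return (estimate,
            math.exp(log_or - z * std_err),
            math.exp(log_or + z * std_err))


# (newborns with b-TSH > 5 uU/ml, all newborns); case counts are assumed,
# inferred from the percentages given in the text.
REFERENCE = (15, 533)
ZONE_B = (21, 425)
ZONE_A = (9, 56)

if __name__ == "__main__":
    for label, group in (("zone B", ZONE_B), ("zone A", ZONE_A)):
        print(label, odds_ratio(*group, *REFERENCE))
```

With these inferred counts the crude estimates are about 1.8 for zone B and 6.6 for zone A, close to the reported odds ratios of 1.79 and 6.60.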
Table 1
Characteristics, Dioxin Contamination Zones, and Neonatal b-TSH Levels of the Mother–Child Pairs Included in the 1994–2005 Seveso Population-Based Study
Table 2
Neonatal b-TSH Levels in Children Born between 1994 and 2005 to Women from Zone A (the Zone Most Contaminated after the Seveso Accident), Zone B, and the Surrounding Noncontaminated Area (Reference)
Figure 1
Distribution of Neonatal b-TSH by Dioxin Contamination Zone
Table 3
Frequency and Relative Odds of Elevated Neonatal b-TSH Levels (> 5 μU/ml) in Children Born between 1994 and 2005 to Women from Zone A (the Zone Most Contaminated after the Seveso Accident), Zone B, and the Surrounding Noncontaminated Area
Eight of the newborns in our study (three from reference [0.6%], four from zone B [0.9%], and one from zone A [1.8%]) had b-TSH levels > 10 μU/ml, which is set as the recall threshold for further laboratory and clinical investigations in the Lombardy region. Two of them (one from zone B [0.2%] and one from zone A [1.8%]) had b-TSH > 10 μU/ml twice in recall tests and were eventually diagnosed with congenital primary hypothyroidism (p = 0.049, Fisher's exact test). The remaining six children all had b-TSH < 5 μU/ml at the first recall and did not undergo further testing.
In this study, 228 women had more than one child during the study period. We calculated the difference in b-TSH between the first and the second child (228 pairs), and between the third and the second child (14 pairs). Neonatal b-TSH decreased with time between one birth and the next (Table 4) in the contaminated zones (A and B), while no decrease was found in the reference population (p = 0.03 for the interaction with zones).
Table 4
Variations in Neonatal b-TSH between Siblings born between 1994 and 2005 to Women from Zone A, the Zone Most Contaminated after the Seveso Accident, Zone B, and the Surrounding Noncontaminated Area (Reference)
All results in the residence-based population study were similar after adjustment for gender, birth weight, birth order, maternal age at delivery, hospital, and type of delivery (Tables 2–4).
Study Based on Plasma Dioxin Measurements
Maternal TCDD levels estimated at the date of delivery were positively associated with neonatal b-TSH (n = 51, standardized regression coefficient [β] = 0.47, p < 0.001, Figure 2A). When other dioxin congeners were also considered, a similar correlation was found with plasma TEQs for PCDDs, PCDFs, and coplanar PCBs (n = 51, β = 0.45, p = 0.005, Figure 2B), but not with non-coplanar PCBs (n = 37, β = 0.16, p = 0.45, Figure 2C). Multivariate regression models adjusting for gender, birth weight, birth order, maternal age at delivery, hospital, and type of delivery confirmed the association of neonatal b-TSH with plasma TCDD (β = 0.75, p < 0.001) and with PCDDs, PCDFs, and coplanar PCBs (β = 0.68, p < 0.001), and the lack of significant correlation with non-coplanar PCBs (β = 0.24, p = 0.46). When the sum of all TEQs from the measured compounds was considered, the correlation with neonatal b-TSH levels was not significant in the crude analysis (n = 37, β = 0.31, p = 0.14, Figure 2D). However, in the multivariate analysis the correlation was significant (β = 0.65, p < 0.001).
Figure 2
Plasma Dioxin Levels and Neonatal b-TSH
We performed regression diagnostics and sensitivity analyses to evaluate the role of influential data points in our analyses [28]. We identified influential points using either of the following two criteria: (i) largest Cook's distance (highest 5%); (ii) strongest impact on the TSH-exposure slope (highest 5% difference). Including or removing single influential points did not modify the strength of the positive association between maternal plasma levels of dioxins and neonatal TSH levels, which always remained statistically significant at the 0.05 level. All positive associations were dependent on the presence in the analyses of participants with very high plasma TCDD level (> 50 ppt, n = 5). When the analysis was restricted to individuals with TCDD ≤ 50 ppt, none of the correlations described above was statistically significant. However, when the nonparametric Spearman's rank correlation coefficients (rs) were used, the associations of b-TSH with plasma TCDD (rs = 0.28, p = 0.04), and TEQs for PCDDs, PCDFs, and coplanar PCBs (rs = 0.33, p = 0.02) were significant.
When newborns were divided by contamination zones, correlations with neonatal b-TSH were highest in zone A for both plasma TCDD (rs = 0.70, p = 0.01), and PCDD, PCDFs, and coplanar PCBs (rs = 0.82, p = 0.001), whereas in the reference area no association was found with either plasma TCDD (rs = −0.25; p = 0.51), or PCDD, PCDFs, and coplanar PCBs (rs = 0.11, p = 0.78).
The analyses described above were based on plasma TCDD levels that, for women with plasma TCDD > 10 ppt, were extrapolated to the date of delivery using a first-order pharmacokinetic model. Using the measured TCDD concentrations in place of the extrapolated levels affected the results only marginally. In particular, neonatal b-TSH levels exhibited significant associations in multivariable models with plasma TCDD (β = 0.68, p = 0.002); plasma TEQs for PCDDs, PCDFs, and coplanar PCBs (β = 0.60, p = 0.004); and sum of all total TEQs (β = 0.65, p = 0.001).
As shown in Table 5, maternal plasma dioxin levels were significantly higher for newborns with b-TSH > 5 μU/ml. Plasma TCDD was 5.2 ppt (95% CI 4.1–6.7) in newborns with b-TSH ≤ 5 μU/ml and 39.0 ppt (95% CI 8.9–173) in those with b-TSH > 5 μU/ml (p = 0.005). Plasma TEQs for PCDDs, PCDFs, and coplanar PCBs were 30.6 ppt (95% CI 26.9–34.8) in newborns with b-TSH ≤ 5 μU/ml and 88.9 ppt (95% CI 43.1–183.5) in those with b-TSH > 5 μU/ml (p = 0.002). Also, in the group with b-TSH ≤ 5 μU/ml, non-coplanar PCB levels (1.5 ppt, 95% CI 1.2–1.8) and the sum of all TEQs (29.2 ppt, 95% CI 25.3–33.5) were significantly different from the levels found in the group with b-TSH > 5 μU/ml (2.9 ppt, 95% CI 1.8–4.6, p = 0.003 for non-coplanar PCBs; 84.5 ppt, 95% CI 16.7–427.8, p = 0.01 for the sum of all TEQs).
Table 5
Plasma Levels of Dioxin Compounds by b-TSH Levels (≤ 5 μU/ml or > 5 μU/ml)

Discussion
 
Neonatal b-TSH, which is used in most countries to screen for congenital hypothyroidism, is considered a sensitive marker of subclinical primary hypothyroidism and a suitable index of the presence of factors causing thyroid enlargement and potential alterations in function [1,29,30]. Our results from the Seveso population showed that newborns of mothers with high body burdens of TCDD, resulting from accidental dioxin exposure occurring approximately 20–30 y earlier, had higher neonatal b-TSH concentrations compared to newborns of nonexposed women.
In our residence-based population study, we observed a shift in the distribution of b-TSH toward higher levels in the exposed groups, thus suggesting that dioxin exposure may produce effects that are detectable at the population level. Mean b-TSH levels increased through the contamination zones, with proportions of b-TSH > 5 μU/ml in the highly contaminated areas (zones A and B) equivalent to those associated with mild iodine deficiency (3%–19.9% according to the WHO) [1]. Epidemiological studies conducted in areas with mild to moderate iodine deficiency have demonstrated, even in the absence of clinical hypothyroidism, abnormalities in psychoneuromotor and intellectual development, including impairment of visual-motor performances, motor skills, perceptual and neuromotor abilities, as well as reduced development and intellectual quotients (IQs) [29,31]. Postnatal cognitive and motor alterations have also been described in children with perinatal exposure to dioxin-related compounds [10,32–36]. At the individual level, only eight of the newborns in our study had b-TSH levels > 10 μU/ml, which is commonly set as the recall threshold for further laboratory and clinical investigations for congenital hypothyroidism. After further testing, two children from the contaminated areas and none from the reference were diagnosed with primary hypothyroidism. Our residence-based population study did not include women from the zone R area, which was an area with low-level and patchy contamination, representing a circular strip between the highly contaminated zones (A and B) and the surrounding reference area [20]. Additional research is warranted to determine whether neonatal thyroid function has been altered by dioxin exposure in the zone R population.
Plasma dioxin has been shown in Seveso and elsewhere to decrease exponentially with time in individuals with high body burdens, while it is nearly constant in individuals with background exposure [24,25]. In our residence-based population study, the analysis of changes in neonatal b-TSH between siblings from the contaminated zones showed that b-TSH was higher in the first child and tended to decrease with time in subsequent children. No time-related decrease in b-TSH was seen in the reference zone. This finding provides indirect evidence for a decrease of dioxin effects on neonatal b-TSH in conjunction with the time-related elimination of dioxin described in Seveso [15].
Our analysis based on dioxin plasma measurements confirmed the results of the residence-based population study and allowed us to confirm directly a positive correlation of b-TSH levels with current plasma TCDD estimated at birth, as well as with TEQs of dioxin-like compounds. Persistence of elevated TCDD levels more than 20 y after the accident and the relative strength of the associations suggest that TCDD was the main factor driving the relation between dioxin plasma concentrations and b-TSH. Exposure in the Seveso accident was predominantly to TCDD [14], and the associations we observed with neonatal b-TSH may reflect differences in exposure dose, as well as differential susceptibility in the infants to dioxin effects.
Our results, showing higher b-TSH levels in TCDD-exposed individuals, are consistent with animal investigations indicating that maternal TCDD exposure induces elevated b-TSH and neonatal primary hypothyroidism [2,7,8,10]. Previous investigations in humans, which have measured thyroid function and TCDD exposure in mother–child pairs from the general population, produced inconsistent results [2]. Initial reports from the Netherlands suggested that infants born to mothers with PCDD, PCDF, and PCB concentrations on the higher side of the population range had higher plasma TSH levels [37,38]. Koopman-Esseboom et al. [37] evaluated thyroid function on 78 breast-fed children at 2 wk and 3 mo of age. Levels of 22 individual PCDD, PCDF, and PCB congeners, measured in human milk samples, were within background levels and correlated with higher infant plasma TSH levels measured in the second week and third month after birth. In a subsequent study, Pluim et al. [38] performed thyroid function tests in 38 breast-fed infants at birth and at 1 and 11 wk of age and found that infants breast-fed with milk containing TEQ levels above the median of the study group had higher mean plasma TSH at 11 wk of age, relative to infants below the TEQ median, while TSH levels were not different between the two groups at birth and 1 wk after birth.
After these two initial reports, a series of studies have been conducted that have not confirmed the association between dioxin exposure and thyroid function alterations [39–42]. The largest of these studies, which was conducted in Japan, showed no association of serum TSH, total T4, total T3, and free T4, measured on 337 breast-fed children at 1 y of age, with breast milk TEQ background levels from 41 PCDD, PCDF, and PCB congeners measured in maternal breast milk 30 d after birth [40]. More recently, a study conducted on a sample of 118 children from the general population of central Taiwan found in female newborns, but not in males, a negative correlation between cord b-TSH levels and TEQ levels from 17 PCDDs and PCDFs, and 12 PCBs measured in placental tissue [43]. A recent study that used the CALUX assay as a functional measure of dioxin activity on cord blood samples from 198 newborns observed a negative association with cord blood free T3 and T4, but not with TSH [44]. Several studies have investigated the association of dioxin-like and nondioxin-like PCBs with neonatal thyroid function [45–49], reporting associations of either the sum of PCBs or individual congeners with decreased free T4 [45,46], and increased TSH [47,48], though between-study consistency was limited. Discrepancies among studies may reflect random variability in investigations with relatively small sample size, likely to have insufficient statistical power to detect potentially subtle effects from low dioxin and PCB levels [2]. Our study has the advantage of being based on a unique population of women exposed to high levels of TCDD. Our residence-based population study included over 1,000 newborns whose thyroid function was evaluated after birth using standardized procedures for blood sample collection, handling, and shipment. b-TSH measurements were performed at a single laboratory using the same method throughout the study period.
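The remark about small studies lacking statistical power can be made concrete with a short calculation. The Python sketch below uses the Fisher z approximation to estimate the two-sided power (alpha = 0.05) to detect a weak correlation at several sample sizes taken from the studies mentioned above. The assumed true correlation of 0.1 (`ASSUMED_R`) and the function names are hypothetical choices for illustration only; the cited studies used their own designs and analyses, so this is not a reanalysis of any of them.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def correlation_power(r, n, z_crit=1.959964):
    """Approximate two-sided power to detect a true correlation r
    with n subjects, using the Fisher z transformation.
    z_crit is the critical value for alpha = 0.05."""
    shift = math.atanh(r) * math.sqrt(n - 3)
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


# Hypothetical "subtle" effect; not an estimate from any study.
ASSUMED_R = 0.1

# Sample sizes of the order reported in the text above.
for n in (38, 78, 337, 1000):
    power = correlation_power(ASSUMED_R, n)
    print(f"n = {n:4d}: approximate power = {power:.2f}")
```

Under this assumed effect size, the approximation gives low power for samples of a few dozen infants and much higher power for a sample of about 1,000, which is the sense in which small studies may disagree through random variability alone.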
Our study based on plasma dioxin measurements, which was conducted on a smaller sample of individuals only partially included in the residence-based population study, allowed for the evaluation of the dose-effect relationship with different metrics of dioxin concentrations. Because b-TSH levels were measured 72 h after birth in our study, mother–child dioxin transfer through colostrum [50] might have contributed to the correlation in our study between dioxin and b-TSH levels. Because we did not have information on breast-feeding before b-TSH measurement, the relative contribution of dioxins from colostrum could not be assessed in our study.
In all our analyses, we adjusted for potential determinants of b-TSH or TCDD levels, including gender, birth weight, birth order of the newborn, maternal age at delivery, hospital, and type of delivery. Information on other determinants of neonatal b-TSH levels, including maternal iodine intake, was not available. However, there is no indication that exposed and unexposed women had differences in iodine intake in the study period. The exposed and referent populations all lived in a relatively small geographical area at the time of the accident [14]. Changes of residence immediately after the accident and later migrations determined geographical distributions across the Lombardy region that are very similar for the exposed and reference populations [14,20,51]. All available indicators have shown close comparability in terms of social, educational, cultural, and occupational characteristics, as well as of access to health-care services, including primary care physicians, obstetrics and gynecology specialists, and hospital care [14,20,52]. In addition, by adjusting for hospital of birth in multivariable models, we also controlled in our analyses for geographical location, thus reducing the likelihood that systematic environmental differences in iodine concentrations between exposed and reference populations might have biased the results.
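As a rough illustration of why adjusting for hospital of birth can matter, the following Python sketch compares a crude and an adjusted linear model on simulated data. Everything in it is an assumption made for illustration: the data are synthetic, the variable names (`hospital`, `exposed`, `log_tsh`), the effect sizes, and the use of ordinary least squares are hypothetical, and none of it reproduces the study's data or its actual models.

```python
import numpy as np


def exposure_coefficient(exposure, outcome, covariates=None):
    """Ordinary least squares coefficient for `exposure`,
    optionally adjusted for the columns of `covariates`."""
    columns = [np.ones(len(outcome)), exposure]
    if covariates is not None:
        columns.append(covariates)
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return beta[1]  # position 0 is the intercept


rng = np.random.default_rng(0)
n = 2000

# Hypothetical setting: three hospitals, with exposure more common
# in some hospitals than in others (this creates confounding).
hospital = rng.integers(0, 3, size=n)
exposed = (rng.random(n) < 0.2 + 0.2 * hospital).astype(float)
birth_weight = rng.normal(3.3, 0.5, size=n)  # kg, simulated

# Simulated outcome: true exposure effect 0.3, plus a hospital effect.
log_tsh = (0.1 + 0.3 * exposed + 0.5 * hospital
           - 0.05 * birth_weight + rng.normal(0.0, 0.5, size=n))

# Indicator (dummy) columns for hospital, with hospital 0 as reference.
hospital_dummies = np.eye(3)[hospital][:, 1:]
covariates = np.column_stack([hospital_dummies, birth_weight])

crude = exposure_coefficient(exposed, log_tsh)
adjusted = exposure_coefficient(exposed, log_tsh, covariates)
print(f"crude coefficient:    {crude:.3f}")
print(f"adjusted coefficient: {adjusted:.3f}")
```

In this simulated setting the crude coefficient absorbs part of the hospital effect, while the adjusted coefficient should lie closer to the simulated value of 0.3. Adjustment of this kind only removes bias from factors that are captured by the covariates included in the model.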
Our findings from the Seveso population indicate that maternal exposure to persistent environmental contaminants such as TCDD produces effects on neonatal thyroid function that may occur far apart in time from the initial exposure. To clarify the clinical significance of our findings, further investigation on developmental outcomes after maternal dioxin exposure is warranted.
 
Supporting Information
 
Alternative Language Abstract S1: Translation of Abstract into Italian by Andrea Baccarelli
(22 KB DOC)
Alternative Language Abstract S2: Translation of Abstract into Spanish by Alice Marta Croce
(36 KB DOC)
Alternative Language Abstract S3: Translation of Abstract into Chinese by Zhu Zhongzeng
(21 KB DOC)
Abbreviations

b-TSH: blood thyroid-stimulating hormone
CI: confidence interval
OR: odds ratio
PCB: polychlorinated biphenyl
PCDD: dibenzo-p-dioxin
PCDF: dibenzofuran
ppt: parts per trillion
TCDD: 2,3,7,8-Tetrachlorodibenzo-p-dioxin
TEQ: toxic equivalent
WHO: World Health Organization
Footnotes
 
Author contributions. AB, CC, DC, and PAB designed the study. SMG and PG organized and performed the data collection. MTL, DGP, ACP, and PAB designed and established the Seveso Chloracne Study. MB performed data analyses under AB's and DC's supervision. AB and SMG wrote the manuscript. AB, SMG, CC, MTL, MB, DC, PG, DGP, ACP, and PAB contributed to the data interpretation and critical revision of the manuscript.
Funding: This work was supported by the following Research Grants: Italian Ministry of University and Scientific Research (MIUR) Internationalization Program 2004–2006/97-C, and CARIPLO Foundation and Lombardy Region Research Contracts numbers UniMi 8614/2006 and UniMi 9167/2007. The study sponsors had no role in the study design; collection, analysis, and interpretation of data; writing of the paper; and decision to submit it for publication.
Competing Interests: The authors have declared that no competing interests exist.
 
PLoS Med. 2008 July; 5(7): e161.
Published online 2008 July 29. doi: 10.1371/journal.pmed.0050161.
PMCID: PMC2488197