Tooth width measurement using the Lythos digital scanner

© Australian Society of Orthodontists Inc. 2017
Introduction: Digital models have become more widely accepted for orthodontic diagnostic purposes. Intraoral scanners have the advantage of eliminating the need for conventional impressions. The aim of the present study was to assess the reliability and reproducibility of the Lythos intraoral scanner and to determine if a significant advantage is delivered over stone model and caliper measurements in tooth width and Bolton ratio accuracy.
Methods: The study comprised 30 typodont models for which conventional alginate impressions and digital scans were obtained to generate stone and digital models, respectively. Mesiodistal tooth width measurements and Bolton ratios were obtained either with calipers on stone models or with Digicast software (Ormco Lythos digital model software, Ormco, CA, USA) on digital models. Pearson’s correlation coefficients tested intra-examiner reliability. Intraclass correlation coefficients were used to assess agreement between examiners (reproducibility). Validity was assessed by comparing the mean tooth width measurements and Bolton ratios obtained by each method with the typodont values using two-tailed t-tests.
Results: The measurements obtained from the Lythos and stone models had near perfect intra-examiner agreement (Pearson ≥ 0.98). The inter-examiner reproducibility for tooth widths, anterior Bolton ratio and overall Bolton ratio was high and similar for both methods (Lythos scanner intraclass correlation coefficient (ICC) above 0.89, stone models ICC above 0.92). Stone model measurements were, on average, about twice as close to the typodont values as those derived from the digital system (mean difference 0.032 mm versus 0.074 mm), and this difference was statistically significant. There were no significant differences in accuracy between the methods for Bolton calculations. Clinically, there was no difference between the methods for tooth width measurements and Bolton calculations.
Conclusions: The Lythos system is as reliable and reproducible as conventional calipers and stone models in tooth width measurements and Bolton calculations. The caliper method presents a statistically more valid tooth width measurement technique but the clinical significance of this is questionable. (Aust Orthod J 2017; 33: 73-81)


Michael Bowes: michael@northsideorthodontics.com.au; William Dear: willdear78gmail.com; Emily Close: emily@twbaortho.com.au; Terrence Freer: T.Freer@uq.edu.au

Introduction
Comprehensive records and thorough treatment planning are essential in achieving successful orthodontic outcomes. Proportional tooth size compatibility between the upper and lower arches is a clinical feature that needs to be assessed during orthodontic pretreatment assessment. If initially undiagnosed, it may unfavourably affect the finished result. 1 The Bolton analysis is the best known proportional tooth size analysis and is traditionally performed on stone study models using Vernier calipers. The accuracy of this method has been tested and shown to be valid, reliable and reproducible, leading to its adoption as the 'gold standard' in determining proportional tooth width discrepancies. 2,3 There has been a trend towards digital records, including clinical notes, photographs and radiographs. More recently, digital models, which offer several advantages, have been developed and incorporated into orthodontic practices. Electronic storage and retrieval of study models reduces physical storage space, prevents damage or loss of models and allows the easy transfer of records and the ability to view models in multiple places simultaneously. 4,5 Digital files allow for immediate information exchange for consultation or referral. 6 Furthermore, digital models can be virtually manipulated and magnified, and cross-sectional views can be created. 5 Three-dimensional models can be printed if required for the construction of orthodontic appliances. 6 The files can be easily electronically dispatched for use by study clubs or for consultations. 7 Advances in optics and computing have led to the development of direct intraoral scanning, which has the potential to make conventional impressions obsolete.
Eliminating conventional impressions has the benefit of reducing errors associated with air bubbles, displacement and movement of the tray, tray distortion or distortion from disinfection procedures or the setting of the materials themselves. 5 The ability to avoid conventional impressions in patients with severe gag reflexes, cleft lip and palate or those at risk of aspiration is a significant benefit to an orthodontic practice. 5 Although the validity, reliability and reproducibility of direct intraoral scanners have been confirmed in the literature, 5,6,8 the statistical differences between digital and conventional methods have not been investigated to determine which method provides better accuracy.
No previous study has assessed the Lythos intraoral scanner (Ormco, CA, USA), which was released in 2013. The Lythos system captures data without the pre-powdering of teeth using autofluorescence imaging (AFI technology) to obtain data in real time (versus post-process stitching), acquiring a high-definition detail scan at all angulations of the tooth surface. The Lythos system captures 2.5 million 3D data points per second, which results in a rapid single high-resolution scan. This data is then sent to Ormco for the development of digital models, which can be analysed using their proprietary Digicast software (Ormco, CA, USA).
The aim of the present study was to determine the validity, reliability and reproducibility of the Lythos scanner in determining tooth width and Bolton ratios. Differences were analysed to determine which method, either digital or conventional, delivers the more accurate results.

Materials and methods
The study was conducted on typodont models. The sample size was based on a significance level of 0.05 to detect a mesiodistal tooth width difference of 0.5 mm between model types with a power of 80%. This revealed that a sample size of 27 was required. The sample size was increased to 30 to match previous related studies within the field.
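A calculation of this kind can be sketched with the standard normal-approximation formula for a paired comparison. The choice of formula, the function name `required_pairs` and the standard deviation of 0.9 mm are assumptions made purely for illustration; the standard deviation and the exact method used are not reported here, so the sketch is not expected to reproduce the figure of 27.

```python
import math
from statistics import NormalDist


def required_pairs(delta_mm, sd_diff_mm, alpha=0.05, power=0.80):
    """Approximate number of paired observations needed to detect a mean
    difference of delta_mm with a two-sided test, using
    n = ((z_(1 - alpha/2) + z_power) * sd / delta) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = ((z_alpha + z_power) * sd_diff_mm / delta_mm) ** 2
    return math.ceil(n)                            # round up to whole models


# sd_diff_mm=0.9 is a hypothetical value, not taken from the study.
print(required_pairs(delta_mm=0.5, sd_diff_mm=0.9))
```

The required sample size grows with the square of the ratio of the standard deviation to the difference to be detected, so a modest change in the assumed variability changes the result noticeably.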
Thirty typodont models were constructed using varying combinations of acrylic denture teeth of nine different brands from six different suppliers. This ensured that a wide range of tooth shapes and sizes was available to create different Bolton ratios on each set of models (Figure 1). All teeth were included from first molar to first molar in both arches and the teeth were of normal anatomy and undamaged. The typodonts were set up with 10 cases each of mild (<4 mm), moderate (4-8 mm) and severe (8+ mm) crowding in an attempt to mimic the findings of clinical practice. A test scan of one typodont model was done under supervision of the Ormco Lythos technical representative to ensure that the acrylic teeth and wax could be readily identified in the scans. Subsequently, the scanning of all 30 models was undertaken utilising the Lythos digital impression system (Figure 2). Following typodont scanning, the images were uploaded onto Ormco's secure web-based 'cloud' to enable construction of the digital study models.
Impressions of the same typodont setups were also obtained using stock trays and alginate material (Kromopan™, Florence, Italy). The impressions were poured in orthodontic dental stone (Whip mix™, KY, USA) within three hours, and the impressions and models were checked for impression or casting defects. The individual teeth were then removed from the typodont setups to allow visualisation from all directions and unimpeded measurement of the maximum mesiodistal tooth diameters using a digital caliper (Mitutoyo, Tokyo, Japan). The mesiodistal tooth width was defined as the maximum mesiodistal distance between the anatomical contact points, as they would lie if the teeth were aligned, measured parallel to the occlusal and labial or buccal surfaces of the teeth. 6 These measurements served as the true values of tooth width against which measurements from the digital and caliper methods were tested.
Mesiodistal tooth widths and Bolton calculations were undertaken on the digital study models and the stone casts. The stone casts were measured using digital calipers which had their points sharpened to enable placement within the embrasure areas of the teeth (Figure 3). Caliper measurements were recorded to 1/100th of a millimeter. The digital study models were measured using Ormco's Digicast software (CA, USA). The digital models were manipulated as required (rotated and magnified), and were measured using a personal computer with a 21.5" display and a standard mouse (HP Compaq 8200 Elite, Hewlett-Packard, CA, USA). The software limits measurements to increments of 1/10th of a millimeter. The models were measured in a random order. All of the models from one capture method were measured first, followed by the second method, and the models were measured in batches of no more than five to minimise examiner fatigue. All measurements were entered directly into an Excel™ spreadsheet (Microsoft, WA, USA).
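For reference, the Bolton ratios can be computed from the twelve mesiodistal widths of each arch as sketched below. The anterior ratio uses the six anterior teeth (canine to canine) and the overall ratio uses all twelve teeth (first molar to first molar); 77.2% and 91.3% are Bolton's published mean values. The function name, the list ordering and the example widths are illustrative assumptions and do not represent the Digicast implementation.

```python
BOLTON_ANTERIOR_MEAN = 77.2  # Bolton's mean anterior ratio (%)
BOLTON_OVERALL_MEAN = 91.3   # Bolton's mean overall ratio (%)


def bolton_ratios(maxillary_mm, mandibular_mm):
    """Return (anterior_ratio, overall_ratio) in percent.

    Each argument lists 12 mesiodistal widths in mm, ordered from the
    right first molar to the left first molar (assumed ordering), so the
    six anterior teeth occupy positions 3 to 8.
    """
    if len(maxillary_mm) != 12 or len(mandibular_mm) != 12:
        raise ValueError("expected 12 tooth widths per arch")
    anterior = slice(3, 9)  # canine to canine
    anterior_ratio = 100 * sum(mandibular_mm[anterior]) / sum(maxillary_mm[anterior])
    overall_ratio = 100 * sum(mandibular_mm) / sum(maxillary_mm)
    return anterior_ratio, overall_ratio


# Hypothetical widths (mm), for illustration only.
upper = [10.2, 6.8, 7.0, 7.9, 6.7, 8.6, 8.6, 6.7, 7.9, 7.0, 6.8, 10.2]
lower = [11.0, 7.1, 7.0, 6.9, 5.9, 5.4, 5.4, 5.9, 6.9, 7.0, 7.1, 11.0]
anterior_ratio, overall_ratio = bolton_ratios(upper, lower)
print(f"anterior {anterior_ratio:.1f}% (mean {BOLTON_ANTERIOR_MEAN}%)")
print(f"overall  {overall_ratio:.1f}% (mean {BOLTON_OVERALL_MEAN}%)")
```

Because each ratio is built from sums of six or twelve widths, small errors in individual tooth measurements can combine in the final value.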
The primary examiner, a senior orthodontic resident, measured each set of models twice, separated by a period of two weeks to prevent recall bias. The additional examiners, an orthodontist and an experienced general dentist, measured each set of models once using both measurement methods. The examiners were blinded to the identity of the models by assigning a random number label to the stone model and a random letter label to the digital model.
Houston (1983) described validity as 'the extent to which, in the absence of measurement error, the value obtained represents the object of interest.' 9 This is the same as accuracy, which reflects the exact measurement of an object. Measurements of validity take place against a gold standard. 10 The validity of digital models and stone casts in the present study was tested against the tooth width measurements of the typodont teeth.
Reproducibility, as defined by Houston, is how close successive measurements of the same object are to one another. 9 It is a measure of the ability of a measurement to be reproduced by a second examiner. 8 In the present study, reproducibility was assessed by comparing the closeness of measurements obtained by the three examiners. Reliability represents the consistency of measurements under identical conditions. 10 This was evaluated by a comparison of the repeated measurements by the primary examiner.

Statistical analysis
Statistical analysis was performed using Stata (version 12.1; StataCorp, TX, USA). The normality of the data was assessed by histograms of the differences in mean tooth width measurements between the typodonts and either the stone models (caliper) or digital methods. Histograms of individual tooth width differences and overall measurements showed a normal distribution.
Reproducibility was assessed using the intraclass correlation coefficient (ICC) from a two-way random-effects model with absolute agreement. Measurements by all three examiners from both methods were used in this calculation. The association between replicate measurements (reliability) was assessed using the Pearson product-moment coefficient (r).
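The two statistics can be sketched in a few lines of code. The example below computes the single-measure, two-way random-effects, absolute-agreement coefficient, often written ICC(2,1), from the usual analysis of variance mean squares, together with Pearson's r. Whether the single-measure or average-measure form was reported is not stated, so the single-measure form is an assumption; the function names and the example data are hypothetical and this is not the Stata routine used in the study.

```python
import math


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings is a list of rows, one per measured object, each row holding
    one value per examiner.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between objects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between examiners
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical widths (mm): four teeth measured by three examiners.
widths = [[8.51, 8.48, 8.55], [6.92, 6.90, 6.95], [7.80, 7.76, 7.84], [10.10, 10.02, 10.15]]
print(round(icc_2_1(widths), 3))
```

Under absolute agreement, a constant offset between examiners lowers the ICC, whereas Pearson's r is unaffected by such an offset. This is one reason the two statistics are not interchangeable.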
As the reproducibility and reliability were both found to be high, the measurements from the three examiners were used to calculate the validity of the measurements made on each tooth. Validity was assessed using the mean value obtained from each method employed by the three examiners. A two-tailed paired t-test was used to compare the recordings made using the digital method and the caliper method with those measurements obtained from the typodont. This was done for each tooth width, the mean tooth width, the anterior Bolton ratio and the overall Bolton ratio. The level of significance was set at p ≤ 0.05.
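The paired comparison described above can be illustrated as follows. The sketch computes the mean difference, the t statistic and the degrees of freedom for measured values against reference values; the p value would then be read from the t distribution with n - 1 degrees of freedom. The function name and the example widths are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(measured, reference):
    """Return (mean_difference, t_statistic, degrees_of_freedom) for a
    paired comparison of measured values against reference values."""
    diffs = [m - r for m, r in zip(measured, reference)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean_diff, mean_diff / std_error, n - 1


# Hypothetical widths (mm) of one tooth type across five models.
digital = [7.9, 8.4, 9.0, 6.8, 10.1]
typodont = [8.00, 8.45, 9.10, 6.90, 10.15]
mean_diff, t_stat, df = paired_t(digital, typodont)
print(f"mean difference {mean_diff:.3f} mm, t = {t_stat:.2f}, df = {df}")
```

A negative mean difference indicates that the method under test recorded smaller widths than the reference.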
Clinical significance was defined as 0.5 mm for individual tooth width measurements and 2 mm for Bolton ratios, in keeping with literature trends. 8

Results
There were significant differences between the tooth widths obtained using both measurement methods and the true typodont value (p < 0.05). On average, the widths were 0.074 mm less using the digital method and were 0.032 mm less using the caliper method. The differences between measurement methods are displayed in Table I. More statistically significant differences in tooth width were found using the digital method than the caliper method (13 teeth compared with 11). Of the seven teeth that had upper limits outside the clinically acceptable range, only the caliper-measured lower left first molar appeared to reflect a systematic error, attributable to the measuring bias of one examiner on this tooth. The smallest mean difference was found for the lower right canine, while the upper right first premolar had the largest mean difference overall. The variability of the measurements (standard deviation) was least for the lower left lateral incisor and greatest for the upper right second premolar. Only 11 of the 1440 mean tooth width measurement differences were clinically significant (>0.5 mm), representing 0.76% of the data. Six were produced by the caliper method and the remaining five were derived from the digital method.
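The proportions quoted above can be checked with simple arithmetic. The sketch assumes that the 1440 differences arise from 30 models, 24 teeth per model (first molar to first molar in both arches) and two methods; this breakdown is inferred from the study design rather than stated explicitly.

```python
MODELS = 30
TEETH_PER_MODEL = 24  # 12 upper + 12 lower, first molar to first molar
METHODS = 2           # caliper and digital

total = MODELS * TEETH_PER_MODEL * METHODS  # all mean differences
per_method = MODELS * TEETH_PER_MODEL       # differences per method

outside_caliper = 6  # differences > 0.5 mm, caliper method
outside_digital = 5  # differences > 0.5 mm, digital method

pct_all = 100 * (outside_caliper + outside_digital) / total
pct_caliper = 100 * outside_caliper / per_method
pct_digital = 100 * outside_digital / per_method

print(total, round(pct_all, 2), round(pct_caliper, 1), round(pct_digital, 1))
```

Under this assumption, the per-method proportions are about 0.8% for the caliper method and 0.7% for the digital method.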
There were no statistically significant or clinically significant differences when calculating the Bolton ratios using either method (Table II). Using the calipers produced a smaller mean difference and range when calculating the anterior Bolton ratio. The digital method had a slightly smaller mean value for overall Bolton ratio, but also had a larger range.
The reliability was high for both measurement techniques. Table III displays the differences between the primary examiner's replicate measurements, and the p values for the differences between replications. Approximately 40% of the repeated measurements for the caliper method were statistically different (p < 0.05), whereas there were no significant differences generated by the digital method. The digital method had a smaller mean difference compared with the caliper method (0.008 mm versus 0.021 mm). The Pearson correlation coefficients were 0.99 for both methods, indicating high concordance between repeated measurements of tooth widths (Table IV). The scatterplots of the differences between repeated measurements show that the second measurement was smaller using both methods and that there was a tight grouping of measurement differences (Figures 4 and 5). The Pearson correlation coefficient values were 0.98 and above for both methods when calculating Bolton ratios, again indicating high concordance.
The inter-examiner reproducibility was high for both methods. Table V shows the ICC values for both methods. The caliper method was more consistent at calculating tooth widths and overall Bolton ratios, and the digital method was more consistent at calculating the anterior Bolton ratio.

Discussion
This was the first investigation that evaluated the validity, reliability and reproducibility of the Lythos digital scanner and associated software in determining tooth width sizes and calculating Bolton ratios. As this study was conducted on typodont teeth whose mesiodistal tooth widths were measured individually, a direct comparison between digital methods and caliper methods was able to be undertaken to determine which technique provided a more robust method of tooth width measurement and Bolton ratio calculation.
Both the digital method and the caliper method produced tooth width measurements that differed statistically from the typodont values. Although these differences were small, the caliper technique was slightly more accurate. Luu et al. 11 reported a clinically significant difference in two-point measurements as 0.5 mm, which also agrees with the level of clinical significance set by Naidu and Freer. 8 When a threshold for clinical significance of 0.5 mm for tooth width measurements is applied, both methods tested in the present study are clinically valid. Both methods tended to record tooth width measurements that were smaller than the typodont values. The limits of agreement with both methods were below the clinically significant difference of 0.5 mm. Only 0.8% of tooth width measurements fell outside the clinically acceptable range using the caliper method and 0.7% using the digital method. These results compare favourably with previous studies within the field, which have reported absolute mean differences up to 0.38 mm. 12
Studies comparing measurements made on digital models with those made on stone models have shown that digital measurements can vary compared with their conventional counterparts. 8,13 Previous studies have found that measurements made on stone casts tended to overestimate measurements taken directly intraorally. 14,15 The present study found that both methods tended to underestimate the true tooth widths. The reasons for this could include: (1) distortion associated with the impression material prior to pouring the models; 16 (2) the process of scanning and recording data points and the algorithms used in the construction of digital models may affect the dimensional accuracy of the digital models; (3) although there was a calibration period, the examiner's inexperience in taking measurements may have affected the results recorded; (4) the typodont teeth were measured individually, so the maximum mesiodistal width was easily identified, whereas these areas may have been obscured by localised crowding or tooth angulations on the digital and stone models.
The lower incisors generally have the smallest range of measurement error and are the most accurately measured teeth. Being more square in shape, it is hypothesised that these teeth will provide more consistent landmark identification. This is in contrast to the molars, which, being more rounded, potentially present more difficulty in consistently identifying the same measurement landmarks. The larger mean differences and standard deviations of the molars and premolars would support this contention.
The anterior Bolton ratio was 0.029% smaller than the typodont value when using caliper measurements, and the overall Bolton ratio was 0.089% larger. When calculated by the digital method, the anterior and overall Bolton ratios were overestimated by 0.31% and 0.071% respectively. The Bolton ratios were also assessed as differences in millimeter values to determine clinical relevance. Endo et al. recommended that discrepancies in excess of 2 mm be regarded as clinically significant. 17 Similarly, Othman and Harradine suggested that a 2 mm discrepancy is clinically acceptable. 18 Although the mean Bolton discrepancies using both measurement methods in the present study were neither clinically nor statistically significant, the upper and lower limits of the digital method fell outside of the clinically acceptable range.
Although both methods are acceptable techniques for measuring tooth widths and calculating Bolton ratios from a clinical perspective, the caliper method provided more accuracy. The small but statistically significant difference in accuracy may be a product of the coarser measuring increment within the Lythos system. The Lythos system allows measurements to 1/10th of a millimeter, whereas the caliper measurements were recorded to 1/100th of a millimeter.
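The effect of the measuring increment can be illustrated by rounding a range of hypothetical true widths to each increment and recording the worst-case error. The sketch assumes that a measurement is simply rounded to the nearest increment, which may not reflect how the software or the caliper actually behaves; the function name and the range of widths are illustrative.

```python
def max_rounding_error(step_mm, lo_um=7000, hi_um=8000):
    """Worst-case absolute error (mm) when true widths between lo_um and
    hi_um micrometres are rounded to the nearest multiple of step_mm."""
    worst = 0.0
    for um in range(lo_um, hi_um):
        true_mm = um / 1000
        recorded_mm = round(true_mm / step_mm) * step_mm  # nearest increment
        worst = max(worst, abs(recorded_mm - true_mm))
    return worst


print(f"0.1 mm increment:  up to {max_rounding_error(0.1):.3f} mm")
print(f"0.01 mm increment: up to {max_rounding_error(0.01):.3f} mm")
```

Under this assumption, a 0.1 mm increment can contribute an error of up to 0.05 mm to a single measurement, ten times the 0.005 mm possible with a 0.01 mm increment.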
Pearson correlation coefficients between the primary examiner's repeated measurements were high for both methods for tooth width measurements (r = 0.99) and Bolton ratio calculations (r ≥ 0.98). This indicates that each method is highly reliable. Using the caliper method, approximately 40% of the tooth width measurement means were statistically different on the second measurement. However, the digital method had no statistically different values between repeated measurements. The Bolton ratios using both methods were reliable as determined by the Pearson correlation coefficient. However, the standard deviations were higher than those observed for tooth width measurements. This suggests that the range of measurements is much wider for Bolton ratios than for individual tooth width measurements. This outcome is not unexpected, as the errors from the individual tooth width measurements accumulate in the Bolton calculation and lead to larger differences in the Bolton ratio. Intra-examiner reliability is excellent for both methods and the digital method appears to generate more consistent values.
The digital and caliper methods resulted in high ICC values, which indicated that each method was reproducible between examiners. ICC values above 0.75 have been described as excellent by Roberts and Richmond. 10 The ICC values for both methods in the