Open access

Towards sustainable North American wood product value chains, part 2: computer vision identification of ring-porous hardwoods

Publication: Canadian Journal of Forest Research
4 August 2022

Abstract

Wood identification is vitally important for ensuring the legality of North American hardwood value chains. Computer vision wood identification (CVWID) systems can identify wood without necessitating costly and time-consuming off-site visual inspections by highly trained wood anatomists. Previous work by Ravindran and colleagues presented macroscopic CVWID models for identification of North American diffuse-porous hardwoods from 22 wood anatomically informed classes using the open-source XyloTron platform. This manuscript expands on that work by training and evaluating complementary 17-class XyloTron CVWID models for the identification of North American ring-porous hardwoods, woods that display spatial heterogeneity in earlywood and latewood pore size and distribution and other radial growth-rate-related features. Deep-learning models trained using 4045 images from 452 ring-porous wood specimens from four xylaria demonstrated 98% five-fold cross-validation accuracy. A field model trained on all the training data and subsequently tested on 198 specimens drawn from two additional xylaria achieved top-1 and top-2 prediction accuracies of 91.4% and 100%, respectively, and images devoid of earlywood, latewood, or broad rays did not greatly reduce the prediction accuracy. This study advocates for continued cooperation between wood anatomy and machine-learning experts for implementing and evaluating field-operational CVWID systems.

Résumé

L’Identification des essences de bois revêt une importance cruciale pour assurer la légalité des chaînes de valeur des feuillus nord-américains. Les systèmes d’identification des essences de bois par vision informatique (IEBVI) permettent d’identifier les essences de bois sans nécessiter d’inspections visuelles hors sites coûteuses en temps et en argent par des anatomistes du bois très qualifiés. Des travaux antérieurs réalisés par Ravindran et ses collègues ont présenté des modèles macroscopiques d’IEBVI pour l’identification de feuillus poreux diffus nord-américains de 22 catégories de bois informées sur le plan anatomique en utilisant la plateforme de source ouverte XyloTron. Le présent manuscrit pousse plus loin ces travaux en formant et en évaluant 17 modèles complémentaires d’IEBVI de la catégorie XyloTron pour l’identification des feuillus poreux diffus nord-américains, des essences qui affichent une hétérogénéité spatiale dans la taille et la distribution des spores du bois de printemps et du bois d’été et d’autres caractéristiques reliées au taux de croissance radial. Des modèles d’apprentissage profond formés en utilisant 4045 images de 452 spécimens de bois poreux de quatre xylarias ont montré une précision de validation croisée quintuple de 98 %. Un modèle de terrain formé sur toutes les données de la formation et testé subséquemment sur 198 spécimens tirés de deux xylarias additionnels ont permis de réaliser les deux plus importantes prévisions de 91,4 % et 100 % respectivement et les images dépourvues de bois de printemps, de bois d’été ou de larges rayons n’ont pas réduit considérablement la précision des prévisions. La présente étude prône une coopération continue entre les experts en anatomie du bois et en apprentissage machine pour la mise en oeuvre et l’évaluation des systèmes d’IEBVI opérationnels sur le terrain. [Traduit par la Rédaction]

1. Introduction

Wood identification can be of vital importance for designing, monitoring, and establishing sustainable wood product value chains and for ensuring legality under laws and policies governed by international treaties (e.g., the Convention on International Trade in Endangered Species of Wild Fauna and Flora) as well as national laws and policies (e.g., the United States’ Lacey Act and the 2012 Illegal Logging Prohibition Act of Australia). Wood identification is traditionally performed by wood anatomy experts in a laboratory setting and relies on the ability of human experts to recognize and differentiate anatomical features. Recently, to tackle the paucity of traditional wood identification expertise (Wiedenhoeft et al. 2019), computer vision wood identification (CVWID) systems have been applied both in the laboratory and in the field to address the challenge of identifying wood without a trained expert’s eye (Khalid et al. 2008; Martins et al. 2013; Filho et al. 2014; Figueroa-Mata et al. 2018; Ravindran et al. 2018, 2019, 2021; Damayanti et al. 2019; de Andrade et al. 2020; Ravindran and Wiedenhoeft 2020; Souza et al. 2020). The open-source XyloTron platform (Ravindran et al. 2020, 2021) has shown potential for real-time, field-deployable, screening-level wood identification (Ravindran et al. 2019, 2021; Ravindran and Wiedenhoeft 2020; Arévalo et al. 2021), and with the XyloPhone (Wiedenhoeft 2020), it is possible to move from laptop-based devices to smartphones for field deployment. Both the XyloTron and XyloPhone platforms provide imaging systems that enable the capture of macroscopic features (Miller et al. 2002; Ruffinatto et al. 2015) suitable for wood identification.
Designing high-performing, scalable CVWID systems requires understanding wood anatomy and how that anatomy influences the training, performance, and deployability of convolutional neural networks (CNNs) (Ravindran et al. 2022) or other machine-learning-based models (de Geus et al. 2021). Hwang and Sugiyama (2021) report the classification accuracy of numerous CNN models used in wood identification studies, with most prior works demonstrating a high in silico accuracy of 90% or better and similar performance across different architectures. Most of those studies, however, do not report any subsequent model testing on new, unique specimens, so their real-world applicability is unknown. For CVWID, the number of classes, the number of training images (coverage of anatomical variation), the quality of specimen surface preparation (visibility of anatomical features), the quality of images (clarity of anatomical features), the size of the area imaged vis-à-vis the scale of diagnostic anatomical features, and the degree to which the anatomical features among the classes are similar are all likely to be important factors for CNN architecture design and the eventual field performance of trained models. For this reason, it is vitally important to evaluate how wood anatomy at a range of scales affects imaging and CVWID model performance.
Ravindran et al. (2022) estimated that approximately 40 classes of North American hardwoods need to be included in a field-deployable computer vision model for the North American market, a number substantially greater than anything previously published for this region, either in terms of macroscopic images (Lopes et al. 2020, 10 classes) or at the naked-eye level (Wu et al. 2021, 11 classes). As noted in Ravindran et al. (2022), the influence of class number on CVWID models is unknown, especially for North American hardwoods, where there are, broadly speaking, two wood anatomically distinct groups of woods: the diffuse-porous woods and the ring-porous woods. They therefore used a fundamental domain-specific factor, porosity, to inform taxa selection and label space design. In general, diffuse-porous woods show less wood anatomical spatial heterogeneity with regard to radial growth rate, growth ring domains (earlywood vs latewood), and physiological age of the wood (Ravindran et al. 2022). Diffuse-porous woods of North America also show comparatively lower overall wood anatomical variability (e.g., axial parenchyma patterns, vessel arrangement, and ray width and frequency) than, for example, diffuse-porous tropical woods (e.g., de Andrade et al. 2020; Arévalo et al. 2021) or the latewood of ring-porous North American woods (Fig. 1). Ravindran et al. (2022) therefore separated the North American hardwoods into two groups: the diffuse-porous woods of the earlier work and the ring-porous woods addressed herein.
Fig. 1. Images of the transverse surfaces of Quercus alba specimens with varying growth rates: slow, medium, and fast growth. Images A and B show medium-growth with approximately three complete growth rings. Image B lacks broad rays that are necessary for identifying Quercus. Images C and D are each missing important anatomical features that would allow for accurate identification. As a result of fast radial growth, image C shows a partial, latewood-only growth ring, thus not demonstrating ring porosity. Due to the slow growth conditions, image D displays the relative absence of latewood features, precluding the ready separation of the white oak group from the red oak group. Note also in image D, the ring-porous character of the wood is less obvious as a result of the closely spaced growth rings. Each image represents 6.35 mm of tissue on a side. [Colour online]
Unlike diffuse-porous hardwoods, ring-porous hardwoods, by definition, show dramatic differences between earlywood and latewood within a growth ring and among species (Fig. 1). Due to the spatial heterogeneity displayed by ring-porous woods, it is possible, depending on the area of tissue captured and the respective sizes of the earlywood and latewood regions, to obtain an image that does not exhibit all the anatomical characteristics that typify the wood. Fast radial growth can result in images that show only latewood (Fig. 1C), that is, only the latter-formed portion of a single growth ring. Tangentially varying features (e.g., broad rays in Quercus; Fig. 1B) may be absent in some images. Slow radial growth can produce an image that is primarily earlywood (Fig. 1D). The impact of such spatial heterogeneity as reflected in test images is unknown and unexplored. An initial work purporting to use CVWID to classify ten ring-porous North American hardwoods did not appear to consider spatial heterogeneity related to wood anatomy (Lopes et al. 2020). Furthermore, the apparently subpar image quality of that dataset was first questioned (Wiedenhoeft 2020) and later the machine-learning analysis and underlying dataset were demonstrated to be inherently flawed based on data hygiene for CVWID inference (Ravindran and Wiedenhoeft 2022).
In this study, we develop a CVWID model to identify 17 classes of North American ring-porous woods using the XyloTron platform and a CNN. In addition to performance evaluation for accuracy and domain-informed examination of model misclassifications, we investigate the influence of wood anatomical spatial heterogeneity of ring-porous woods on specimen level model predictions and discuss how other forms of wood anatomical heterogeneity are thus potentially capable of influencing model performance in field deployment settings. Finally, we propose a path for future research for developing a robust, highly accurate, field-deployable, unified North American hardwood model.

2. Materials and methods

2.1. Dataset details

2.1.1. Taxa and sample selection

We selected 68 North American ring-porous hardwood species from 15 prominent genera based on their commercial importance, botanical relevance, and specimen availability from five scientific wood collections and forensically verified specimens from a wood anatomical teaching collection. Table 1 summarizes the details of these six collections and their specimen contributions.
Table 1. Four xylaria, one teaching collection (MSUtw), and a set of scientifically collected, georeferenced stem discs (fscquercus) provided specimens for the image datasets used to train and test the wood identification models. MADw, SJRw, fscquercus, and Tw specimens contributed solely to the training dataset. In contrast, the independent test dataset was obtained from the PACw and MSUtw specimens, with the class-level identifications of the latter confirmed by laboratory analysis.

2.1.2. Sample preparation and imaging

The transverse surface of 650 wood specimens was polished using sanding discs with progressively finer abrasive grit (240, 400, 600, 800, 1000, 1500). Between each grit, compressed air and adhesive tape were used to remove dust from the cell lumina to the extent possible. It should be noted that the aqueous polishing method of Barbosa et al. (2021) is not suitable for entire xylarium specimens, as it would tend to damage historic specimen labels, induce swelling-related checking, cause extractive movement or staining, and (or) a combination of all the above. Our progressive sanding protocol provided a repeatable method for consistently preparing uniform specimen surfaces for imaging. Multiple nonoverlapping images of the transverse surface of each wood sample were captured with the XyloTron platform (Ravindran et al. 2020). Each image had a resolution of 2048 × 2048 pixels and captured an area of tissue that measured 6.35 mm × 6.35 mm with a linear resolution of 3.1 microns/pixel. Multiple instantiations of the XyloTron system along with multiple operators with varying degrees of experience in sample preparation and knowledge of wood anatomy (undergraduate students, graduate students, postdoctoral researchers, and technical specialists) were utilized for sample preparation and image capture. The resulting images were subsequently curated for image quality and the presence of representative anatomical characteristics. Table 2 shows a summary of the collected datasets.
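The stated image dimensions can be cross-checked with a short calculation. The sketch below uses only the pixel count and linear resolution quoted above; the function and variable names are illustrative and do not come from the XyloTron software.

```python
# Figures quoted in the text above.
IMAGE_SIDE_PX = 2048        # pixels per image side
MICRONS_PER_PIXEL = 3.1     # linear resolution


def field_of_view_mm(side_px: int, microns_per_px: float) -> float:
    """Return the side length of the imaged tissue in millimetres."""
    return side_px * microns_per_px / 1000.0


side_mm = field_of_view_mm(IMAGE_SIDE_PX, MICRONS_PER_PIXEL)
area_mm2 = side_mm ** 2
print(f"side = {side_mm:.2f} mm, area = {area_mm2:.1f} mm^2")
```

The product, about 6.35 mm per side, agrees with the stated tissue area of 6.35 mm × 6.35 mm.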
Table 2. Summary of image datasets.

2.1.3. Label assignment

According to Gasson (2011), the light microscopic identification of wood specimens is generally accurate only to the genus level. In this study, we categorized the selected taxa into a combination of generic and subgeneric classes based on the similarity of macroscopic anatomical characteristics to facilitate machine learning and for use on the XyloTron platform. We grouped the taxa into 17 classes in the following ways:
1. The genera Asimina, Carya, Castanea, Catalpa, Celtis, Cladrastis, Fraxinus, Gleditsia, Gymnocladus, Maclura, Morus, Robinia, and Sassafras were each assigned to a genus-level class.
2. The genera Quercus and Ulmus were each split into two classes. Quercus classes were labeled “QuercusR” (red) and “QuercusW” (white) corresponding to the commercial red and white oak groups, which are anatomically distinguishable on the transverse surface on the basis of differences in latewood pore diameter and distribution. Ulmus classes were labeled “UlmusS” (soft) and “UlmusH” (hard) based on commercial grouping and continuous and discontinuous row(s) of earlywood vessels, respectively, and differences in the mean radial and tangential earlywood vessel diameter (Wheeler et al. 1989).
Although class names include genus names, we follow a convention of not italicizing the class names, so that we can distinguish when we are discussing genera or species (which are italicized) versus class names.
Supplement S1 contains a list of the 68 taxa, their class labels, and their training and testing dataset membership.

2.1.4. Spatial heterogeneity datasets

In addition to the 936 images that comprised the main testing dataset shown in Table 2, three smaller datasets were collected to evaluate the effects of spatial heterogeneity on model accuracy (hereinafter the “spatial heterogeneity datasets”, Table 3). From the 192 PACw specimens imaged for the main test dataset, 38 specimens were selected that exhibited (i) especially slow radial growth (narrow, closely spaced growth rings), (ii) especially fast growth (wide growth rings), or (iii) large areas devoid of broad rays (in Quercus). These specimens were reimaged in areas that contained entirely earlywood (to generate the Slow-Growth dataset), virtually no earlywood (to generate the Fast-Growth dataset), or that lacked broad rays (to generate the Broad Rays Absent dataset). The three resulting datasets thus each lacked at least one characteristic wood anatomical feature used by human identifiers to characterize the woods in question.
Table 3. Summary of spatial heterogeneity image datasets.
The classes included in the Slow-Growth dataset are Carya, Cladrastis, Gleditsia, Morus, QuercusR, and QuercusW. The Fast-Growth dataset included classes Catalpa, Cladrastis, Morus, QuercusR, QuercusW, Robinia, and UlmusS. Specimens in QuercusW were the only ones to display images lacking broad rays. Not all classes were included in these datasets due to the absence of specimens in some classes featuring distinctly slow or fast growth. Table 3 summarizes the number of images contained in each dataset.

2.2. Machine-learning details

2.2.1. Model architecture and training

Prior work (e.g., Ravindran et al. 2019, 2020, 2021; Arévalo et al. 2021) has demonstrated the effectiveness of using a two-stage (Howard and Gugger 2020) transfer learning (Pan and Yang 2010) approach for training strong baseline CNN models for CVWID. This training approach was employed here to learn the weights of a ResNet34-based CNN with a custom classifier head that can handle 17 classes. The custom classifier head consisted of global average and global maximum pooling layers, whose outputs were concatenated and fed through two fully connected layers (with batch normalization (Ioffe and Szegedy 2015) and dropout (Srivastava et al. 2014)) in sequence. This was followed by a soft-max layer that produced a class prediction distribution over the 17 classes. In the first stage of training, the ImageNet (Russakovsky et al. 2015) pretrained weights of the backbone were frozen and only the weights of the custom head were learned. During the second stage, the weights in both the backbone and the head were fine-tuned. Data augmentation, which included reflections, rotations, and CutOut (DeVries and Taylor 2017), was performed during training. The learning rate hyperparameter was estimated using the one-cycle policy of Smith (2018) and was annealed (Howard and Gugger 2020) when training using the Adam optimizer (Kingma and Ba 2017). Details about the model architecture, training methodology, hyperparameter optimization, and data augmentation can be found in Ravindran et al. (2022). PyTorch (Paszke et al. 2019) and scientific Python tools (Pedregosa et al. 2011) were used for model definition, training, and evaluation. Additional details of a ResNet50 model trained and evaluated identically to the ResNet34 model are available in Supplement S2.
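The two ends of the classifier head described above, concatenated pooling at the input and soft-max at the output, can be illustrated in plain Python. This is a minimal sketch for intuition only and is not the authors' PyTorch code: the function names and the toy feature map are assumptions, and the fully connected layers, batch normalization, and dropout that sit between the two steps are omitted.

```python
import math


def concat_pool(feature_map):
    """Global average and global max pooling, concatenated.

    feature_map: list of C channels, each an H x W nested list of floats.
    Returns a flat list of length 2 * C: [avg_1..avg_C, max_1..max_C].
    """
    avgs, maxes = [], []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        avgs.append(sum(values) / len(values))
        maxes.append(max(values))
    return avgs + maxes


def softmax(logits):
    """Numerically stable soft-max over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy feature map with 2 channels of size 2 x 2 (hypothetical values).
toy_map = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
pooled = concat_pool(toy_map)      # averages first, then maxima

# Equal scores for 17 classes give a uniform prediction distribution.
probs = softmax([0.0] * 17)
```

For a standard ResNet34 backbone, whose final convolutional stage has 512 channels, this concatenation would yield a 1024-element vector as input to the first fully connected layer.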

2.2.2. Model evaluation

The following analyses were conducted for evaluation of trained models:
1. Training and evaluation were performed using five-fold cross-validation with class-level stratification of the folds and specimen-level separation among the five folds (i.e., each specimen contributed images to exactly one fold). For valid assessment of any machine-learning-based wood identification classifier, it is necessary to enforce specimen-level mutual exclusivity between the folds (e.g., Ravindran et al. 2019, 2020, 2021, as discussed in Hwang and Sugiyama 2021, and in Ravindran and Wiedenhoeft 2022). A confusion matrix and the corresponding top-1 and top-2 prediction accuracies were computed by consolidating the model predictions over the five folds. It should be reiterated that the cross-validation analysis did not include images from the PACw or MSUtw datasets.
2. The five models from the cross-validation analysis, each trained using a different 80% split of the training data, were also tested on the PACw + MSUtw dataset. The top-1 and top-2 accuracies for this analysis are also presented.
3. A field model was trained using all the images from the cross-validation analysis (i.e., 100% of the training data) and evaluated on images from the PACw + MSUtw dataset. A confusion matrix and the top-1 and top-2 prediction accuracies were computed to assess the utility of the field model.
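The fold construction in the first analysis can be sketched as follows. This is one simple way to obtain class-stratified folds with specimen-level separation, not necessarily the procedure the authors used; the function names, the round-robin dealing, and the specimen identifiers are all hypothetical.

```python
import random
from collections import defaultdict


def assign_folds(specimen_classes, n_folds=5, seed=0):
    """Map each specimen to one fold, spreading every class across folds.

    specimen_classes: dict of specimen_id -> class label.
    Returns a dict of specimen_id -> fold index in range(n_folds).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for specimen_id, label in sorted(specimen_classes.items()):
        by_class[label].append(specimen_id)
    folds = {}
    for specimens in by_class.values():
        rng.shuffle(specimens)
        # Deal the specimens of each class round-robin over the folds.
        for i, specimen_id in enumerate(specimens):
            folds[specimen_id] = i % n_folds
    return folds


def split_images(image_specimens, folds, held_out_fold):
    """Split images by the fold of their parent specimen.

    image_specimens: dict of image_id -> specimen_id.
    Returns (training image ids, validation image ids).
    """
    train = [img for img, s in image_specimens.items() if folds[s] != held_out_fold]
    val = [img for img, s in image_specimens.items() if folds[s] == held_out_fold]
    return train, val


# Hypothetical example: 10 specimens in each of two classes, 3 images each.
specimens = {f"carya_{i}": "Carya" for i in range(10)}
specimens.update({f"morus_{i}": "Morus" for i in range(10)})
images = {f"{s}_img{k}": s for s in specimens for k in range(3)}

folds = assign_folds(specimens)
train_images, val_images = split_images(images, folds, held_out_fold=0)
```

Because the split is made on specimens and images only inherit the fold of their specimen, no specimen can contribute images to both the training and validation sides of a fold.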
The predicted top-1 class for a specimen was taken to be the majority of class predictions for the images contributed by the specimen. The top-2 prediction for a specimen was generated by equal weight voting of the top-2 predictions for images from the specimen: if a specimen’s true class was one of the top-2 predicted classes, the specimen was considered correctly identified.
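The specimen-level voting described above can be sketched as follows. This reflects one reading of the text, in which every image casts one equally weighted vote for each of its two highest-ranked classes; the function names and the example predictions are hypothetical, and tie-breaking (here, whichever class was encountered first) is an assumption not specified in the text.

```python
from collections import Counter


def specimen_top1(image_top1):
    """Majority vote over the per-image top-1 class labels."""
    return Counter(image_top1).most_common(1)[0][0]


def specimen_top2(image_top2):
    """Equal-weight vote over the per-image top-2 class labels.

    image_top2: list of (first_choice, second_choice) pairs, one per image.
    Returns the two classes with the most votes.
    """
    votes = Counter()
    for first, second in image_top2:
        votes[first] += 1
        votes[second] += 1
    return [label for label, _ in votes.most_common(2)]


# Hypothetical predictions for five images of one QuercusW specimen.
image_top2 = [
    ("QuercusW", "QuercusR"),
    ("QuercusR", "QuercusW"),
    ("QuercusR", "QuercusW"),
    ("QuercusW", "Castanea"),
    ("QuercusR", "QuercusW"),
]
image_top1 = [first for first, _ in image_top2]

top1 = specimen_top1(image_top1)
top2 = specimen_top2(image_top2)
correct_top1 = top1 == "QuercusW"
correct_top2 = "QuercusW" in top2
```

In this made-up case the majority vote picks QuercusR, so the specimen counts as a top-1 error, while the true class still appears among the two most-voted classes and the specimen counts as correct at top-2.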

2.2.3. Misclassification analysis

Images from all misclassified specimens from the field models were evaluated and assigned to one of three types of misclassification as reported in detail in Ravindran et al. (2022), and we also adopt their source and sink misclassification analysis as implemented therein.

2.2.4. Spatial heterogeneity evaluation

The impact of spatial heterogeneity on model performance was evaluated using the three datasets obtained from the PACw specimens (see Section 2.1.4). Table 4 lists the classes and the number of specimens per class that comprise each of the spatial heterogeneity datasets (Slow-Growth, Fast-Growth, and Broad Rays Absent).
Table 4. Summary of the number of specimens and their class labels included in each of the spatial heterogeneity datasets.

3. Results

The top-1 prediction accuracy for the specimen level cross-validation model was 98.0%. When tested on the PACw + MSUtw dataset, the top-1 and top-2 cross-validation accuracies were 91.9% and 98.3%, respectively. The field model top-1 accuracy was 91.4%, and the top-2 accuracy was 100%. Table 5 shows the summary of the cross-validation (accumulated over the five folds) and field model’s prediction accuracies. Confusion matrices for the cross-validation and field models are shown in Figs. 2 and 3, respectively.
Table 5. Training and testing specimen level model prediction accuracies.
Fig. 2. Confusion matrix for the cross-validation model top-1 predictions on 452 specimens (accumulated over five folds), with a specimen-level accuracy of 98.0%.
Fig. 3. Confusion matrix for the field model’s top-1 predictions on 198 specimens in the PACw + MSUtw dataset. Specimen-level accuracies for top-1 and top-2 predictions were 91.4% and 100%, respectively.
Example images of Type 1 and Type 3 misclassifications from the field model’s confusion matrix (Fig. 3) are shown in Fig. 4. A summary of misclassification data for the field model is presented in Table 6.
Fig. 4. Images of the transverse surface of test specimens from classes Gleditsia and QuercusR (A and C) and exemplar images from classes Gymnocladus (B) and UlmusH (D). Images A and B show Type 1 misclassification where a specimen of Gleditsia was misclassified to the anatomically similar class Gymnocladus. An anatomically typical specimen of the class QuercusR (C) was misclassified as the anatomically disparate class UlmusH (D), a Type 3 misclassification. Note the anatomical similarities between images A and B and the anatomical dissimilarity between images C and D, especially with regard to the difference in ray size, earlywood vessel diameter and arrangement. Also, in images C and D, the arrows indicate a possible comparison of banded parenchyma in the latewood of QuercusR (C) and ulmiform latewood vessel arrangement in UlmusH (D), which may have accounted for these misclassifications. Each image represents 6.35 mm of tissue on a side. [Colour online]
Table 6. Number and proportion of misclassified specimens from Fig. 3 by type of misclassification.
For the top-1 predictions of the field model, 11 classes showed zero source misclassifications on the PACw + MSUtw dataset: Asimina, Carya, Castanea, Catalpa, Celtis, Fraxinus, Gymnocladus, Morus, Robinia, Sassafras, and UlmusH. The remaining six classes each showed at least one source misclassification (Fig. 3), for a total of 17 misclassified specimens out of 198 test specimens. Those misclassified specimens were attributed to five sink classes: Gymnocladus, QuercusR, QuercusW, Robinia, and UlmusH. Eight classes showed neither source nor sink misclassifications: Asimina, Carya, Castanea, Catalpa, Celtis, Fraxinus, Morus, and Sassafras. Table 6 summarizes the number and proportions of misclassification types. Fifteen of the 17 misclassifications (88.2%) were Type 1, two (11.8%) were Type 3, and none were Type 2.
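The counts reported above are mutually consistent, as a short check shows. The sketch uses only the numbers stated in the text (198 test specimens; 15 Type 1, 0 Type 2, and 2 Type 3 misclassifications); the variable names are illustrative.

```python
N_TEST_SPECIMENS = 198
MISCLASSIFIED_BY_TYPE = {"Type 1": 15, "Type 2": 0, "Type 3": 2}

# Total misclassified specimens across all types.
n_wrong = sum(MISCLASSIFIED_BY_TYPE.values())

# Specimen-level top-1 accuracy implied by the misclassification count.
top1_accuracy = 100.0 * (N_TEST_SPECIMENS - n_wrong) / N_TEST_SPECIMENS

# Share of each type among the misclassified specimens.
type_share = {
    name: 100.0 * count / n_wrong
    for name, count in MISCLASSIFIED_BY_TYPE.items()
}

print(f"top-1 accuracy: {top1_accuracy:.1f}%")
for name, share in type_share.items():
    print(f"{name}: {share:.1f}% of misclassifications")
```

The 17 misclassified specimens correspond to 181 correct predictions out of 198, which reproduces the 91.4% top-1 accuracy of the field model.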
When tested against the three spatial heterogeneity datasets, prediction accuracy of the field model remained nearly unchanged at 91.3% in the case of the Slow-Growth dataset, and fell by 11.4% for the Fast-Growth dataset and 8.3% for the Broad Rays Absent (QuercusW) dataset. Of the Slow-Growth dataset, a specimen from class Cladrastis was predicted as Gymnocladus and a specimen from the class QuercusW was predicted as QuercusR. Within the Fast-Growth dataset, two specimens from the class UlmusS were predicted as UlmusH. The Broad Rays Absent dataset, which consisted of the class QuercusW, had one of six specimens misclassified as QuercusR. Table 7 summarizes the accuracies for the three spatial heterogeneity datasets when tested with the field model. A comparison of the test specimen and an example image of the predicted class of each spatial heterogeneity dataset is shown in Fig. 5.
Table 7. Specimen-level field model performance metrics on spatial heterogeneity datasets. Top-1 field model accuracy on typical images was 91.4% (Fig. 3).
Fig. 5. Images of the transverse surface of test specimens from classes Cladrastis, QuercusW, and UlmusS (A, C, and E) along with exemplar images from classes Gymnocladus (B), QuercusR (D), and UlmusH (F). Image pairs A and B, C and D, and E and F illustrate the misclassification within the Slow-Growth, Broad Rays Absent, and Fast-Growth spatial heterogeneity datasets. In image D the arrow indicates a broad ray. In image F the arrow indicates a line of earlywood vessels. [Colour online]

4. Discussion

4.1. Deployment gap of cross-validation and field testing

The deployment gap, the drop in accuracy (Ravindran et al. 2021) between the top-1 cross-validation and field-testing accuracy when tested on PACw and MSUtw specimens, was 6.6%. Previous studies by Ravindran et al. (2019, 2021) found deployment gaps of 25.0% and 10.5%, respectively, and in the diffuse-porous North American hardwoods, a deployment gap of 14.6% was reported (Ravindran et al. 2022). Research in other fields of computer vision has found a comparable loss in accuracy when models are tested on completely new datasets (Recht et al. 2019; Zech et al. 2018). According to Recht et al. (2019), there is a strong likelihood that models will struggle to generalize to images that present greater challenges than those in the original dataset. Other factors described in Ravindran et al. (2021) that might influence this deployment gap include minor variations in the anatomical patterns between xylarium specimens and the wood currently available in the market, differences between green and dry wood, variability in operator skills, and (or) systematic differences imparted by different instantiations of imaging equipment. Using CVWID models in human-in-the-loop scenarios, such as the xyloinf classification software (Ravindran et al. 2020), provides users with the top predictions and exemplar images for the predicted classes, permitting the incorporation of human judgment. Additionally, organoleptic characters unavailable to the CVWID system, such as odor, luster, and taste, could serve as initial indicators of wood identity as well as assist in identification of Type 3 misclassifications by visual comparison of field images with representative images.
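The deployment gap quoted above is a simple difference in percentage points, as the sketch below shows. The accuracies and the earlier gaps are the values stated in the text; the function name and dictionary keys are illustrative.

```python
def deployment_gap(cross_validation_acc: float, field_acc: float) -> float:
    """Drop in top-1 accuracy, in percentage points, from cross-validation to field testing."""
    return round(cross_validation_acc - field_acc, 1)


# Top-1 accuracies reported in this study.
gap = deployment_gap(98.0, 91.4)

# Deployment gaps reported in the earlier studies cited above.
PRIOR_GAPS = {
    "Ravindran et al. 2019": 25.0,
    "Ravindran et al. 2021": 10.5,
    "Ravindran et al. 2022 (diffuse-porous)": 14.6,
}
smaller_than_all_prior = all(gap < prior for prior in PRIOR_GAPS.values())
print(f"deployment gap: {gap} percentage points")
```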
CVWID is typically formulated as an inductive-learning problem where a model ($f$) that maps images (from $\mathcal{X}$) to labels (from $\mathcal{Y}$) is learned using labeled training data and the quality of the trained model is evaluated on its capability to generalize to unseen test data. The test data are assumed to be independent and identically distributed (i.i.d.) from the same distribution as the training data, i.e., if $P_t$ and $P_d$ represent the training and deployment data distributions, then $P_t(X) = P_d(X)$ and $P_t(Y) = P_d(Y)$. When the i.i.d. assumption is violated and distributional shifts between the training and testing/deployment data exist, in silico model performance will not translate to commensurate real-world field performance. Two types of distributional shifts can influence the real-world performance of deployed CVWID models: covariate shift ($P_t(X) \neq P_d(X)$) and semantic shift ($P_t(Y) \neq P_d(Y)$).
Differences in wood anatomy, sample preparation, imaging parameters, and operator skill can be sources of covariate shift. In this work, the use of the XyloTron platform to image progressively sanded wood specimen surfaces enabled the capture of consistent image data thereby minimizing covariate shifts due to specimen preparation and imaging. Our study pooled data from different wood collections (using multiple operators for specimen preparation and imaging) to train a model which was then tested on specimens from a different xylarium that did not contribute to the training data (a logistically manageable surrogate for real-world field testing), thereby enforcing separation between the training and testing datasets. The surrogate field testing approach used here (similar to Ravindran et al. 2022) is an initial attempt to incorporate covariate shifts due to operator skill (while following the progressive sample preparation protocol) in the evaluation of trained models. In the context of covariate shifts in relation to spatial heterogeneity of ring-porous woods, the Slow-Growth, Fast-Growth, and Broad Rays Absent datasets were used to evaluate model performance with respect to the positioning of the imaging sensor on the specimen surface.
While our dataset is the largest (in terms of number of images and unique specimens) for the considered classes and leads to models that are deployable in a human-in-the-loop setting, the observed deployment gap for the field model suggests the need for larger datasets for training models that better capture the inter- and intra-class wood anatomical variations. Our models (like most prior works, but see Apolinario et al. 2019) were trained and evaluated based on a closed-world assumption, i.e., there is no semantic shift, and the test specimen belongs to one of the classes the model was trained to identify. Augmented models that include a larger set of woods along with a “catch all” out-of-distribution class and (or) reporting prediction uncertainties can be practical approaches for handling semantic shifts in the data distribution. Elucidating the interplay of dataset sizes, model capacities, and distributional shifts, especially relaxation of sample preparation protocols (e.g., using knife cuts instead of progressive sanding, or sanding to a coarser grit [thus involving fewer steps and less time]) and the closed-world assumption (Scheirer et al. 2013), is likely to be an important challenge in the realization of general field-deployable CVWID systems. We expect the exploration of these ideas (e.g., Mahdavi and Carvalho 2021; Vaze et al. 2021; Yang et al. 2021) to be a fertile area for future work.

4.2. Spatial heterogeneity

According to Panshin and de Zeeuw (1980) and Hoadley (1990), the initial discriminant macroscopic character commonly used in North American hardwood identification is porosity (ring-porous vs diffuse-porous). The second character frequently invoked in such wood identification keys within the ring-porous hardwoods is the characteristic presence of wide rays in Quercus. By definition, vessels in ring-porous woods will display dramatic and abrupt changes in diameter between earlywood and latewood, as well as changes in the parenchyma patterns from earlywood to latewood. Growth rate (slow, medium, or fast; Fig. 1) can affect the appearance of both these features, such that some macroscopic images of fast-grown specimens may not capture the earlywood vessels, some images of slow-grown specimens may not capture latewood features, and some images of Quercus may lack wide rays. Our spatial heterogeneity datasets explicitly tested the influence of these features (or their lack), and the results suggest that this may not affect model predictions as strongly as anticipated (no change for Slow-Growth, 11.4% reduction for Fast-Growth, and 8.3% reduction for Broad Rays Absent). In the Slow-Growth dataset, a specimen from class Cladrastis was misclassified as class Gymnocladus, a Type 1 misclassification that was also observed in the five-fold dataset. In the Fast-Growth dataset, all specimens in class UlmusS were predicted as UlmusH. This outcome was likely caused by the absence of the earlywood zone, where the prominent, continuous and sometimes multiple rows of earlywood vessels would have served to separate UlmusS from UlmusH. Interestingly, none of the images of QuercusW captured without broad rays (e.g., Fig. 1B) was mistaken for class Castanea despite the latter's striking resemblance to the former in the absence of this distinguishing feature.
Figure 5 shows an example comparison of misclassifications in each of the three spatial heterogeneity datasets.

4.3. Analysis of misclassifications

Table 8 presents a summary of the misclassifications in the field model based on the confusion matrix in Fig. 3. Source misclassification proportions were calculated on a per-class basis (n = number of specimens in each class), and sink misclassification proportions were calculated as a percentage of the total number of misclassified specimens across all classes (n = 17). Our analysis found eight classes that have neither source nor sink misclassifications: Asimina, Carya, Castanea, Catalpa, Celtis, Fraxinus, Morus, and Sassafras. Among classes exhibiting Type 1 source misclassifications, the proportions ranged from 2.4% in QuercusR to 87.5% in UlmusS. The only Type 3 source misclassification came from QuercusR, where 4.9% of the specimens were misclassified as UlmusH. Among the five classes with sink misclassifications, UlmusH and QuercusR account for over 60% of the Type 1 misclassifications. It is noteworthy that UlmusH was the only sink for a Type 3 misclassification, with the QuercusR specimens predicted as UlmusH accounting for 11.8% of all misclassified specimens.
Table 8. Proportions of misclassifications in the top-1 predictions in the field model by class.
In the field-model confusion matrix (Fig. 3), two of 49 QuercusR specimens were classified as UlmusH. As noted above, this is a Type 3 misclassification. On reviewing the misclassified images, however, it appears that the model may be emphasizing the relative similarity between the wavy bands of latewood parenchyma in QuercusR and the ulmiform latewood vessel arrangement in UlmusH, while failing to emphasize the importance of earlywood vessel arrangement, ray abundance, and ray width. A human identifier would be unlikely to mistake these features for each other and thus would be unlikely to confuse these woods. This comparison can be seen in Fig. 4 (images C and D).
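The source and sink proportions described above can be computed directly from a specimen-level confusion matrix. The following is a minimal sketch under stated assumptions: the function name and the toy counts are hypothetical (they are not the values in Fig. 3 or Table 8), rows are taken to be true classes and columns predicted classes, and the Type 1 versus Type 3 distinction is not modelled.

```python
def source_sink_proportions(confusion, class_names):
    """Summarise misclassifications per class.

    confusion[i][j] is the number of specimens of true class i that were
    predicted as class j. "source" is the share of a class's own specimens
    that were misclassified; "sink" is the share of all misclassified
    specimens that were wrongly assigned to that class.
    """
    n = len(class_names)
    total_errors = sum(
        confusion[i][j] for i in range(n) for j in range(n) if i != j
    )
    summary = {}
    for k, name in enumerate(class_names):
        class_size = sum(confusion[k])
        source_errors = class_size - confusion[k][k]
        sink_errors = sum(confusion[i][k] for i in range(n) if i != k)
        summary[name] = {
            "source": source_errors / class_size if class_size else 0.0,
            "sink": sink_errors / total_errors if total_errors else 0.0,
        }
    return summary


# Hypothetical toy counts for three classes (invented for illustration).
classes = ["QuercusR", "UlmusH", "UlmusS"]
toy = [
    [37, 2, 1],  # true QuercusR
    [0, 20, 0],  # true UlmusH
    [0, 3, 5],   # true UlmusS
]
result = source_sink_proportions(toy, classes)
for name, row in result.items():
    print(name, round(row["source"], 3), round(row["sink"], 3))
```

Because the sink proportions share a single denominator (all misclassified specimens), they sum to one across classes, whereas the source proportions are normalised separately within each class.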

4.4. Toward a unified model for North American hardwoods

In continuity with our previous study on diffuse-porous hardwoods (Ravindran et al. 2022), we are working to develop a unified CVWID model for North American commercial woods that covers the entire spectrum of porosity patterns. In addition to ring- and diffuse-porous woods, a unified model must incorporate semi-ring-porous woods into the label space design. As CVWID for semi-ring-porous woods is still unexplored, it represents a type of label space heterogeneity that requires parsing before a unified model can be developed.
As future studies lead us closer to model unification, it becomes increasingly important to evaluate model performance in a way that assesses accuracy and most closely approximates model deployment in the field. While CVWID models are commonly evaluated by five-fold cross-validation, Ravindran and Wiedenhoeft (2022) demonstrated the extent to which models developed from inadequate (at best) or potentially misleading (at worst) datasets can misrepresent accuracy when they are not tested with independent specimens. To be sure, testing models with specimens from disparate xylaria not used for model training has been a useful and convenient surrogate for field testing, but the gold standard for CVWID model evaluation remains on-the-ground testing of actual commercial material.
As the number of classes increases to cover more of the commercial hardwood species in North America (somewhere around 40+), we expect to see the overall frequency of misclassifications increase, as well as the frequency of Type 3 misclassifications. Whereas a human trained in wood identification would rarely (if ever) mistake a ring-porous wood for a diffuse-porous wood, a 40+ class CVWID model might. For this reason, it is important to develop large, unified models in such a way as to reduce or eliminate those types of misclassifications. In addition to incorporating domain expertise in designing the label space, this improvement could possibly be accomplished by varying CNN depth, applying penalty weights for out-of-genus misclassifications, or even nesting models inside others.
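As an illustration of what penalty weights for out-of-genus misclassifications might look like, the sketch below adds an expected-cost term to the usual cross-entropy loss, so that probability mass placed on a class from a different genus is penalised more heavily than mass placed on a congeneric class. This is one possible formulation and not a method evaluated in this study; the function names, the genus mapping, and the cost values (1.0 within genus, 3.0 out of genus) are assumptions chosen only for the example.

```python
import math


def build_cost_matrix(genus_of, classes, within_genus=1.0, out_of_genus=3.0):
    """cost[i][j] is the penalty for predicting class j when truth is class i."""
    return [
        [
            0.0 if a == b
            else within_genus if genus_of[a] == genus_of[b]
            else out_of_genus
            for b in classes
        ]
        for a in classes
    ]


def penalized_loss(probs, true_idx, cost, weight=1.0):
    """Cross-entropy plus the expected misclassification cost under probs."""
    cross_entropy = -math.log(max(probs[true_idx], 1e-12))
    expected_cost = sum(c * p for c, p in zip(cost[true_idx], probs))
    return cross_entropy + weight * expected_cost


# Hypothetical label space and genus mapping (assumed for illustration).
classes = ["QuercusR", "QuercusW", "UlmusH"]
genus_of = {"QuercusR": "Quercus", "QuercusW": "Quercus", "UlmusH": "Ulmus"}
cost = build_cost_matrix(genus_of, classes)

# Same confidence in the true class (QuercusR), different error directions.
in_genus = penalized_loss([0.6, 0.4, 0.0], 0, cost)   # error mass on QuercusW
out_genus = penalized_loss([0.6, 0.0, 0.4], 0, cost)  # error mass on UlmusH
print(in_genus < out_genus)
```

Both example predictions have identical cross-entropy, so the difference in loss comes entirely from the penalty term. In practice such a loss would be implemented in a deep-learning framework so that it is differentiable during training.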

5. Conclusions

The CVWID model presented here is one of the first, and the largest, to be developed for ring-porous hardwoods in North America. A 17-class model was trained using 4045 images captured from 452 specimens of ring-porous woods from four xylaria to determine how well the model handles spatial heterogeneity. A five-fold cross-validation showed a 98.0% accuracy while a field model tested on 198 specimens drawn from two additional xylaria achieved top-1 and top-2 predictions of 91.4% and 100%, respectively. Results tested on three smaller spatial heterogeneity datasets also showed that images devoid of earlywood, latewood, or broad rays did not greatly reduce prediction accuracy. This study emphasizes the continued importance of allowing wood anatomy to inform CVWID model creation and evaluation and advises against relying solely on CNN architecture to increase accuracy. In an ongoing study, we are working on developing a combined model for North American diffuse-porous and ring-porous hardwoods that will also examine how semi-ring-porous hardwoods (such as Juglans) affect the predictions of the model and the possibilities of enabling computer-vision models to predict classes in a more anatomically informed way.

Acknowledgements

The authors wish to gratefully acknowledge the specimen preparation and imaging efforts of Nicholas Bargren, Karl Kleinschmidt, Caitlin Gilly, Richard Soares, Adriana Costa, and Flavio Ruffinatto.

Code availability

The software apps for image dataset collection and trained model deployment along with the weights of the trained model will be made available at https://github.com/fpl-xylotron.

References

Apolinario M.P.E., Urcia Paredes D.A., Huaman Bustamante S.G. 2019. Open set recognition of timber species using deep learning for embedded systems. IEEE Latin Am. Trans. 17(12): 2005–2012.
Arévalo B. R.E., Pulido R. E.N., Solórzano G. J.F., Soares R., Ruffinatto F., Ravindran P., et al. 2021. Image based identification of Colombian timbers using the XyloTron: a proof of concept international partnership. Colomb. For. 24(1): 5–16.
Barbosa A.C.F., Gerolamo C.S., Lima A.C., Angyalossy V., Pace M.R. 2021. Polishing entire stems and roots using sandpaper under water: an alternative method for macroscopic analyses. Appl Plant Sci. 9(5): aps3.11421.
Damayanti R., Prakasa E., Krisdianto, Dewi L.M., Wardoyo R., Sugiarto B., et al. 2019. LignoIndo: image database of Indonesian commercial timber. IOP Conf. Ser.: Earth Environ. Sci. 374(1): 012057.
de Andrade B.G., Basso V.M., de Figueiredo Latorraca J.V. 2020. Machine vision for field-level wood identification. IAWA J. 41(4): 681–698.
de Geus A.R., Backes A.R., Gontijo A.B., Albuquerque G.H.Q., Souza J.R. 2021. Amazon wood species classification: a comparison between deep learning and pre-designed features. Wood Sci. Technol. 55: 857–872.
DeVries T., Taylor G.W. 2017. Improved Regularization of Convolutional Neural Networks with Cutout. [cs]. Available from http://arxiv.org/abs/1708.04552 [accessed 15 March 2022].
Figueroa-Mata G., Mata-Montero E., Valverde-Otarola J.C., Arias-Aguilar D. 2018. Automated Image-based identification of forest species: challenges and opportunities for 21st century xylotheques. In 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI). IEEE, San Carlos. pp. 1–8.
Filho P.L.P., Oliveira L.S., Nisgoski S., Britto A.S. 2014. Forest species recognition using macroscopic images. Mach. Vis. Appl. 25(4): 1019–1031.
Gasson P. 2011. How precise can wood identification be? Wood anatomy's role in support of the legal timber trade, especially CITES. IAWA J. 32(2): 137–154.
Hoadley R.B. 1990. Identifying wood: accurate results with simple tools. Taunton Press, Newtown, CT, USA. p. 223.
Howard J., Gugger S. 2020. Fastai: a layered API for deep learning. Information, 11(2): 108.
Hwang S.-W., Sugiyama J. 2021. Computer vision-based wood identification and its expansion and contribution potentials in wood science: a review. Plant Methods, 17(1): 47.
Ioffe S., Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning, May, San Diego, CA. pp. 448–456.
Khalid M., Lew E., Lee Y., Yusof R., Nadaraj M. 2008. Design of an intelligent wood species recognition system. Int. J. Simul. Syst. Sci. Technol. 9: 9–19.
Kingma D.P., Ba J. 2017. Adam: a method for stochastic optimization. Proceedings of 2015 International Conference on Learning Representations, San Diego, CA. [cs]. Available from http://arxiv.org/abs/1412.6980 [accessed 15 March 2022].
Lopes D.J., Burgreen G.W., Entsminger E.D. 2020. North American hardwoods identification using machine-learning. Forests, 11(3): 298.
Mahdavi A., Carvalho M. 2021. A survey on open set recognition. [cs]. Available from http://arxiv.org/abs/2109.00893 [accessed 15 March 2022].
Martins J., Oliveira L.S., Nisgoski S., Sabourin R. 2013. A database for automatic classification of forest species. Mach. Vis. Appl. 24(3): 567–578.
Miller R., Wiedenhoeft A., Ribeyron M-J. 2002. CITES identification guide — tropical woods. Environment Canada, Canada.
Pan S.J., Yang Q. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10): 1345–1359.
Panshin A.J., de Zeeuw C. 1980. Textbook of Wood Technology: Structure, identification, properties, and uses of the commercial woods of the United States and Canada. 4th ed. McGraw-Hill Series in Forest Resources. New York, McGraw-Hill Book Co.
Paszke A., Gross S., Massa F., Lerer A., Bradbury J., Chanan G., et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. [cs, stat]. Available from http://arxiv.org/abs/1912.01703 [accessed 15 March 2022].
Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., et al. 2011. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12(85): 2825–2830.
Ravindran P., Wiedenhoeft A.C. 2020. Comparison of two forensic wood identification technologies for ten Meliaceae woods: computer vision versus mass spectrometry. Wood Sci. Technol. 54(5): 1139–1150.
Ravindran P., Wiedenhoeft A.C. 2022. Caveat emptor: on the need for baseline quality standards in computer vision wood identification. Forests, 13(4): 632.
Ravindran P., Costa A., Soares R., Wiedenhoeft A.C. 2018. Classification of CITES-listed and other neotropical Meliaceae wood images using convolutional neural networks. Plant Methods, 14(1): 25.
Ravindran P., Ebanyenle E., Ebeheakey A.A., Abban K.B., Lambog O., Soares R., et al. 2019. Image Based Identification of Ghanaian Timbers Using the XyloTron: Opportunities, Risks and Challenges. [cs]. Available from http://arxiv.org/abs/1912.00296 [accessed 15 March 2022].
Ravindran P., Thompson B.J., Soares R.K., Wiedenhoeft A.C. 2020. The XyloTron: flexible, open-source, image-based macroscopic field identification of wood products. Front. Plant Sci. 11: 1015.
Ravindran P., Owens F.C., Wade A.C., Vega P., Montenegro R., Shmulsky R., et al. 2021. Field-deployable computer vision wood identification of Peruvian timbers. Front. Plant Sci. 12: 647515.
Ravindran P., Owens F.C., Wade A.C., Shmulsky R., Wiedenhoeft A.C. 2022. Towards sustainable North American wood product value chains, part I: computer vision identification of diffuse porous hardwoods. Front. Plant Sci. 12: 758455.
Recht B., Roelofs R., Schmidt L., Shankar V. 2019. Do ImageNet Classifiers Generalize to ImageNet? New York: Cornell University. https://doi.org/10.48550/arXiv.1902.10811.
Ruffinatto F., Crivellaro A., Wiedenhoeft A.C. 2015. Review of macroscopic features for hardwood and softwood identification and a proposal for a new character list. IAWA J. 36(2): 208–241.
Russakovsky O., Deng J., Su H., Krause J., Satheesh S., Ma S., et al. 2015. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3): 211–252.
Scheirer W.J., de Rezende Rocha A., Sapkota A., Boult T.E. 2013. Toward open set recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(7): 1757–1772.
Smith L.N. 2018. A disciplined approach to neural network hyper-parameters: part 1—learning rate, batch size, momentum, and weight decay. [cs, stat]. Available from http://arxiv.org/abs/1803.09820 [accessed 15 March 2022].
Souza D.V., Santos J.X., Vieira H.C., Naide T.L., Nisgoski S., Oliveira L.E.S. 2020. An automatic recognition system of Brazilian flora species based on textural features of macroscopic images of wood. Wood Sci. Technol. 54(4): 1065–1090.
Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R. 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56): 1929–1958. Available from http://jmlr.org/papers/v15/srivastava14a.html [accessed 13 May 2022].
Vaze S., Han K., Vedaldi A., Zisserman A. 2021. Open-Set recognition: a good closed-set classifier is All You Need. [cs]. Available from http://arxiv.org/abs/2110.06207 [accessed 15 March 2022].
Wheeler E.A., LaPasha C.A., Miller R.B. 1989. Wood anatomy of elm (Ulmus) and hackberry (Celtis) species native to the United States. IAWA J. 10(1): 5–26.
Wiedenhoeft A.C. 2020. The XyloPhone: toward democratizing access to high-quality macroscopic imaging for wood and other substrates. IAWA J. 41(4): 699–719.
Wiedenhoeft A.C., Simeone J., Smith A., Parker-Forney M., Soares R., Fishman A. 2019. Fraud and misrepresentation in retail forest products exceeds U.S. forensic wood science capacity. PLoS One, 14(7): e0219917.
Wu F., Gazo R., Haviarova E., Benes B. 2021. Wood identification based on longitudinal section images by using deep learning. Wood Sci. Technol. 55(2): 553–563.
Yang J., Zhou K., Li Y., Liu Z. 2021. Generalized out-of-distribution detection: a survey. [cs]. Available from http://arxiv.org/abs/2110.11334 [accessed 15 March 2022].
Zech J.R., Badgeley M.A., Liu M., Costa A.B., Titano J.J., Oermann E.K. 2018. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 15(11): e1002683.

Supplementary material

Supplementary Material 1 (PDF / 71.4 KB).
Supplementary Material 2 (PDF / 123 KB).

Published In

Canadian Journal of Forest Research
Volume 52, Number 7, July 2022
Pages: 1014 - 1027

History

Received: 21 March 2022
Accepted: 19 May 2022
Accepted manuscript online: 25 May 2022
Version of record online: 4 August 2022

Data Availability Statement

A minimal dataset can be obtained by contacting the corresponding author, but the full training dataset used in the study is protected for up to 5 years by a CRADA between FPL, UW-Madison, and FSC.

Key Words

  1. wood identification
  2. illegal logging and timber trade
  3. XyloTron
  4. computer vision
  5. machine learning
  6. deep learning
  7. diffuse porous hardwoods
  8. ring-porous hardwoods
  9. sustainable wood products

Mots-clés

  1. identification des essences de bois
  2. commerce illégal du bois et des rondins
  3. XyloTron
  4. vision informatique
  5. apprentissage machine
  6. apprentissage profond
  7. feuillus poreux diffus
  8. feuillus aux anneaux poreux
  9. produits ligneux durables

Authors

Affiliations

Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI 53706, USA
Center for Wood Anatomy Research, USDA Forest Service Products Laboratory, 1 Gifford Pinchot Drive, Madison, WI 53726, USA
Adam C. Wade
Department of Sustainable Bioproducts, Mississippi State University, 201 Locksley Way, Starkville, MS 39759, USA
Department of Sustainable Bioproducts, Mississippi State University, 201 Locksley Way, Starkville, MS 39759, USA
Rubin Shmulsky
Department of Sustainable Bioproducts, Mississippi State University, 201 Locksley Way, Starkville, MS 39759, USA
Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, WI 53706, USA
Center for Wood Anatomy Research, USDA Forest Service Products Laboratory, 1 Gifford Pinchot Drive, Madison, WI 53726, USA
Department of Sustainable Bioproducts, Mississippi State University, 201 Locksley Way, Starkville, MS 39759, USA
Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, USA
Departamento de Ciências Biolôgicas (Botânica), Universidade Estadual Paulista – Botucatu, São Paulo, Brasil

Author Contributions

FO and RS provided access to and supervised data acquisition from the PACw and MSUtw test specimens. AW prepared and imaged the PACw and MSUtw specimens and prepared the initial draft of the paper. AW, FO, and ACW established the wood anatomical scope of the study. PR implemented the machine-learning pipelines for the study. PR and ACW conducted data analysis and synthesis. PR, ACW, AW, and FO wrote the paper. All the authors contributed actionable feedback that improved the presentation of the paper. AW: Adam Wade, ACW: Alex C. Wiedenhoeft.

Competing Interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding Information

This work was supported in part by a grant from the US Department of State via Interagency Agreement number 19318814Y0010 to ACW and in part by research funding from the Forest Stewardship Council to ACW. PR was partially supported by a Wisconsin Idea Baldwin Grant. The authors wish to acknowledge the support of US Department of Agriculture (USDA), Research, Education, and Economics (REE), Agriculture Research Service (ARS), Administrative and Financial Management (AFM), Financial Management and Accounting Division (FMAD) Grants and Agreements Management Branch (GAMB), under Agreement No. 58-0204-9-164, specifically for support of AW, FO, and RS. Any opinions, findings, conclusion, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the US Department of Agriculture.
