Delineation of Deep Anthracite Seam with Random Forest Predictor that Learnt Multi-geophysical and Drill-hole Datasets

A research team led by Kim Kang Sop, an institute head at the Faculty of Earth Science and Technology, has succeeded in delineating deep anthracite seams with a random forest predictor trained on multi-geophysical and drill-hole datasets.

Anthracite formed in the Upper Paleozoic is a fundamental fuel and raw-material resource in our country. It is therefore of great significance to improve the accuracy of deep surveys based on drillholes and geophysical data in the vicinity of existing anthracite mines.

Several geophysical methods are applicable because anthracite has low resistivity, density, and magnetic susceptibility, and a high spontaneous potential (SP). However, these methods differ in their limits of penetration depth and resolution, and are strongly affected by noise from power lines, terrain, and geological complexity.

They conducted multi-geophysical fieldwork at an anthracite mine, comprising transient electromagnetic (TEM), SP, gravity, and magnetic surveys, and constructed a database of the existing drillholes. The study area, in difficult terrain, had four anthracite seams beneath the rough surface down to a depth of about 800 m. The seams behaved irregularly owing to structural activity in the Mesozoic era. Fortunately, many holes had already been drilled into the anthracite seams and could provide useful information for interpreting the geophysical datasets.

The task was to build an appropriate strategy for constraining the deep anthracite seams and evaluating the reserve by inverting the multi-geophysical datasets together with the existing drill-hole database, across the whole study area including the parts with no drillholes.

The conventional approaches posed problems. First, drill-hole data can be converted into a priori information for inverting geophysical data. However, separate inversion of each dataset may yield subsurface images of physical properties that differ markedly from the real geology. Second, joint inversion of the multi-geophysical datasets is an alternative, but it requires elaborate code to incorporate realistic terrain and drill-hole information.

Thus, they chose the random forest (RF) predictor, which is recognized as one of the most powerful methods for solving multi-class classification problems. The RF is a prediction (classification) algorithm that uses a classification tree as its elementary learner and combines many such trees through ensemble aggregation. It shows high generalization performance compared with other machine-learning algorithms.
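The idea of a classification tree as the elementary learner, combined through ensemble aggregation, can be illustrated with a minimal sketch in plain Python. This is not the team's code: the toy data, the labels "seam" and "host", the four anonymous features, and all function names are hypothetical, and the tree-growing rules (Gini impurity, bootstrap samples, a random subset of features per split, majority vote) are the textbook random forest recipe assumed here for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(X, y, n_try, rng):
    """Search a random subset of features for the split with the lowest weighted Gini."""
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_try):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow_tree(X, y, depth, n_try, rng):
    """Grow one classification tree; a leaf stores the majority class."""
    split = best_split(X, y, n_try, rng) if depth > 0 and len(set(y)) > 1 else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], depth - 1, n_try, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], depth - 1, n_try, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the class stored there."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, depth=4, seed=0):
    """Train each tree on its own bootstrap sample of the data (bagging)."""
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))  # features tried at each split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], depth, n_try, rng))
    return forest

def predict_forest(forest, row):
    """Ensemble aggregation: majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Toy data (invented): four features, but only feature 0 decides the class.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(60)]
y = ["seam" if row[0] > 0.5 else "host" for row in X]

forest = fit_forest(X, y)
print(predict_forest(forest, [1.0, 0.5, 0.5, 0.5]))
print(predict_forest(forest, [0.0, 0.5, 0.5, 0.5]))
```

Because every tree sees a different bootstrap sample and a different feature subset at each split, individual trees may err in different places, while the vote over the whole forest tends to be more stable than any single tree.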

On this basis, they wrote an RF predictor. Its inputs were the TEM, SP, gravity, and magnetic datasets, and its outputs were designed to yield the upper depth and thickness of the deep anthracite seams. The teacher signals were derived from the elevation and thickness of seams known from drillholes. After learning the given teacher signals successfully, the RF predictor yielded an objective evaluation of the study area, including undrilled parts, and enabled them to predict new anthracite-rich regions and evaluate the reserve.
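The input and output layout described above might look roughly like the following sketch, which assumes NumPy and scikit-learn are available. Everything in it is an assumption made for illustration: the data are synthetic, the station counts and coefficients are invented, and the article does not say how the continuous outputs (depth and thickness) were encoded, so a two-output random forest regressor is swapped in here rather than the team's own predictor.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical feature table: one row per survey station, with four columns
# standing in for TEM, SP, gravity, and magnetic responses (synthetic values).
n_stations = 200
X = rng.normal(size=(n_stations, 4))

# Synthetic teacher signals: upper depth (m) and thickness (m) of a seam.
# In the study these came from drillholes; here they are invented functions of X.
upper_depth = 400.0 + 80.0 * X[:, 0] - 30.0 * X[:, 2] + rng.normal(scale=5.0, size=n_stations)
thickness = 3.0 + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n_stations)
y = np.column_stack([upper_depth, thickness])

# Pretend that only the first 150 stations lie near a drillhole.
drilled = np.arange(n_stations) < 150

# Train on the drilled stations, using both targets at once.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[drilled], y[drilled])

# Predict upper depth and thickness for the undrilled stations.
pred = model.predict(X[~drilled])
print(pred.shape)
```

Each row of `pred` holds a predicted upper depth and thickness for one undrilled station, which is the kind of output that could then be mapped across the study area.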