TY - JOUR
T1 - Machine Learning Analysis of Borehole Data for Geotechnical Insights
AU - Mitelman, Amichai
N1 - Publisher Copyright: © 2024 by the author.
PY - 2024/12
Y1 - 2024/12
N2 - This paper explores the use of machine learning (ML) to analyze borehole data aiming to enhance geotechnical insights, using the Gaza Strip as a case study. The data set consists of 632 boreholes, with features including spatial coordinates, ground level, and soil type per depth. A random forest (RF) classification model was applied to predict soil types, achieving an accuracy of approximately 75%. Notably, the model retained this accuracy even when the data set size was reduced to 30%, suggesting predictable subsurface conditions over large areas. A comparative analysis of common misclassifications revealed that errors mostly occurred between similar soil types, indicating the model’s ability to capture meaningful geological patterns. Unsupervised learning using k-means clustering revealed no clear-cut boundaries between clusters, indicating localized geological anomalies despite large-scale predictability. These findings align with the demonstrated stability of the Gaza Tunnel Network (GTN), a vast network of tunnels which was constructed without comprehensive site investigations. This study demonstrates the potential of ML to improve geotechnical assessments and suggests that fewer boreholes may be needed for large-scale projects, offering cost-saving opportunities. For future research, it is recommended to integrate advanced ML tools, including large language models (LLMs) for analyzing qualitative data from borehole logs, and interpretability methods to enhance model explainability, thus enhancing geological understanding and increasing predictive power.
KW - boreholes
KW - coastal geology
KW - machine learning
KW - site investigation
KW - tunnel stability
UR - http://www.scopus.com/inward/record.url?scp=85215411449&partnerID=8YFLogxK
U2 - 10.3390/geotechnics4040060
DO - 10.3390/geotechnics4040060
M3 - Article
AN - SCOPUS:85215411449
SN - 2673-7094
VL - 4
SP - 1175
EP - 1188
JO - Geotechnics
JF - Geotechnics
IS - 4
ER -