
Performance of the machine learning method XGBoost for prediction of clinical health disorders in lactating dairy cows.

M. M. Pérez




M. M. Pérez*¹, Y. You², Y. Wang², K. Q. Weinberger², D. V. Nydam³, J. O. Giordano¹. ¹Department of Animal Science, Cornell University, Ithaca, NY; ²Department of Computer Science, Cornell University, Ithaca, NY; ³Department of Population Medicine and Diagnostic Sciences, Cornell University, Ithaca, NY.

Our objective was to evaluate the ability of the machine learning (ML) method XGBoost to predict the occurrence of different clinical health disorders in lactating dairy cows in early lactation using multiple sensor and non-sensor data. The clinical health status of lactating Holstein cows (n = 1,211) was determined by daily clinical examination from 1 to 30 DIM. Clinical conditions recorded were metritis, mastitis, ketosis, indigestion, and displaced abomasum. Cows were considered to have a clinical disorder on every day on which any of these conditions was recorded. Sensor data offered to the ML models were physical activity, resting behavior, reticulo-rumen temperature, rumination, eating behavior, and environmental temperature and humidity from −21 to 30 DIM. After calving, data were also available for BW and daily milk volume, conductivity, and components (fat, protein, lactose). Non-sensor data used were previous health and reproductive events, production records, and stocking density. For each individual disorder (metritis, mastitis) or group of disorders (Met-Dig: ketosis, displaced abomasum, indigestion) of interest, an XGBoost model was developed using 80% of the data for training and 20% for testing. Performance metrics were estimated for the model of each disorder of interest (Table 1). Some models (mastitis and Met-Dig) tended to overfit the training data and did not generalize to the testing data, likely because of the limited number of outcomes available for training and the unbalanced ratio of positive to negative outcomes. ML models created using XGBoost performed differently across health disorders when offered multiple cow behavioral, physiological, and performance sensor parameters, environmental sensor data, and health, reproductive, and performance records.

Table 1. Performance metrics (sensitivity, Se; specificity, Sp) for models of disorders of interest

Disorder    Train set    Test set
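
For readers unfamiliar with the metrics, the following is a minimal sketch of how Se and Sp can be computed separately for a train set and a test set, and how the ratio of positive to negative outcomes can be checked. It is not the authors' pipeline: the function names and the label lists are hypothetical, the numbers are made up for illustration, and no XGBoost model is fitted here.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (Se, Sp) for binary labels: 1 = disorder recorded, 0 = not recorded."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp


def positive_to_negative_ratio(y_true: Sequence[int]) -> float:
    """Ratio of positive to negative outcomes; small values indicate imbalance."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return pos / neg if neg else float("inf")


# Hypothetical cow-day labels and predictions (illustrative only, not study data).
train_true = [1, 1, 0, 0, 0, 0, 0, 0]
train_pred = [1, 1, 0, 0, 0, 0, 0, 0]  # perfect fit on the training data
test_true = [1, 1, 0, 0, 0, 0, 0, 0]
test_pred = [1, 0, 0, 0, 0, 0, 1, 0]   # one missed case, one false alarm

train_se, train_sp = sensitivity_specificity(train_true, train_pred)
test_se, test_sp = sensitivity_specificity(test_true, test_pred)
print(f"Train: Se={train_se:.2f}, Sp={train_sp:.2f}")
print(f"Test:  Se={test_se:.2f}, Sp={test_sp:.2f}")
print(f"Positive:negative ratio (test) = {positive_to_negative_ratio(test_true):.2f}")
```

A large drop in Se or Sp from the train set to the test set, as in this toy example, is the pattern described above as overfitting.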

Keywords: machine-learning, disease, dairy cow.