Pairwise Comparison Aggregation with Tool-Augmented LLMs for Procurement Value Lever Prioritization

Vitalii Shevchuk
Oleksandr Kondratiuk
Christoph Flöthmann

Abstract

Traditional prioritization of procurement Value Levers (VLs) relies on manual expert evaluation across multiple strategic dimensions, making the process time-intensive, resource-demanding, and vulnerable to subjective inconsistency. This study investigates the use of Large Language Models (LLMs) to semi-automate VL ranking through a pairwise comparison aggregation framework enriched with established strategic tools. The proposed methodology integrates LLM-based contextual reasoning with the Kraljic Matrix, Procurement with Purpose, and SWOT analysis to produce consistent, context-aware rankings across a portfolio of 15 VLs. A systematic experimental design evaluates two model architectures (Llama-3.2-3B-Instruct and Llama-3.1-8B-Instruct), alternative prompt representations (numeric vs. categorical), varying contextual depth, and multiple aggregation mechanisms. Results show moderate alignment with proprietary baseline rankings (weighted Kendall’s τ = 0.38–0.45), with performance improving through prompt calibration and strategic tool integration. Categorical representations consistently outperform numeric formats, indicating that qualitative descriptors better match LLM internal reasoning in procurement contexts. Strategic tool integration reduces ranking position variance by 84% and increases stability sixfold, while maintaining acceptable agreement with domain experts (68% exact agreement; quadratic-weighted Cohen’s κ = 0.62). Position-specific variance is observed for a small subset of VLs (13–20%), which may reflect either model-induced structural effects or the capture of widely accepted procurement best practices. Overall, the findings demonstrate that LLM-based VL ranking is operationally viable for initial screening and sensitivity analysis, substantially reducing manual effort while preserving decision quality.
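As a rough illustration of two ideas named in the abstract, the sketch below ranks items by counting pairwise wins and computes a quadratic-weighted Cohen's κ between two sets of ordinal labels. It is a minimal sketch and not the authors' implementation: the abstract does not say which aggregation mechanisms were used, so the win-count rule is an assumption, and the names `aggregate_by_wins`, `quadratic_weighted_kappa`, `toy_scores`, and the lever labels are hypothetical. In the study the pairwise preference would come from an LLM; here a toy scoring function stands in for it.

```python
from itertools import combinations


def aggregate_by_wins(items, prefer):
    """Rank items by number of pairwise wins (an assumed, simple aggregation rule).

    `prefer(a, b)` must return whichever of a or b is preferred.
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        wins[prefer(a, b)] += 1
    # Most wins first; sorted() is stable, so ties keep their input order.
    return sorted(items, key=lambda item: -wins[item])


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic weights for ordinal labels 0..n_categories-1.

    Undefined (division by zero) when both raters use one identical single label.
    """
    k = n_categories
    n = len(rater_a)
    # Joint distribution of the two raters' labels.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * marg_a[i] * marg_b[j]
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical stand-in for an LLM judgment: prefer the higher toy score.
toy_scores = {"Lever A": 3, "Lever B": 1, "Lever C": 2}
ranking = aggregate_by_wins(
    list(toy_scores),
    lambda a, b: a if toy_scores[a] >= toy_scores[b] else b,
)
kappa_same = quadratic_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 2], 3)
kappa_opposite = quadratic_weighted_kappa([0, 2], [2, 0], 3)
```

Win counting is only one of several possible aggregation rules; probabilistic models such as Bradley-Terry are a common alternative when pairwise judgments are noisy or inconsistent.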

Article Details

Section

Regular Paper

How to Cite

Pairwise Comparison Aggregation with Tool-Augmented LLMs for Procurement Value Lever Prioritization. (2026). International Journal of Management and Data Analytics (IJMADA), 6(1), 88-131. https://ijmada.com/index.php/ijmada/article/view/116
