Large language models are increasingly used in settings where people search for, compare, and evaluate products, yet it is still unclear how closely their inferred preferences match those of consumers. This study examines how a state-of-the-art large language model, GPT, performs in structured food choice tasks and compares its behaviour with that of human participants. We replicated three discrete choice experiments on food products: 3D-printed chocolate, tomato sauce from social farming, and fresh tomatoes with environmental quality claims. Human and GPT-generated choices were analysed with the same discrete choice framework, using prompts that were either uninformed or enriched with sociodemographic and attitudinal profiles. Across all studies, GPT reproduced the qualitative structure of human preferences, correctly identifying influential attributes and the direction of effects with only a few exceptions. However, its estimated utilities were systematically more deterministic, requiring substantial probability rescaling to approximate human choice variability. GPT also struggled with attributes tied to cultural familiarity or sensory expectations. Personalization through profile information did not consistently improve alignment and sometimes increased distortions. When GPT-derived parameters were used as priors in design simulations, design efficiency improved relative to uninformed priors, suggesting value for early-stage experimental design. A comparative analysis across eight commercial models showed high inter-model consistency and strong correspondence with modal human choices. Overall, large language models can serve as useful synthetic shoppers for building and pretesting choice architectures, but limited calibration, contextual grounding, and personalization currently restrict their role as stand-alone surrogates for consumer behaviour in food choice contexts.

Do large language models shop like people? Comparing GPT and human food choices using discrete choice experiments / Califano, Giovanbattista; Caracciolo, Francesco. - In: FOOD QUALITY AND PREFERENCE. - ISSN 0950-3293. - 142:(2026). [10.1016/j.foodqual.2026.105930]

Do large language models shop like people? Comparing GPT and human food choices using discrete choice experiments

Califano, Giovanbattista; Caracciolo, Francesco
2026


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1047011