Symbolic Regression for the Automated Physical Model Identification in Reaction Engineering / Cao, L.; Neumann, P.; Russo, D.; Vassiliadis, V. S.; Lapkin, A. A. - (2019). (Paper presented at the AIChE Annual Meeting 2019).
Symbolic Regression for the Automated Physical Model Identification in Reaction Engineering
Russo D.
2019
Abstract
Understanding a complex reaction system at a fundamental level is crucial, as it reduces the time and resources required for process development and implementation at scale. The two distinct paradigms for developing fundamental knowledge of a chemical system start either from experimental observations (data-driven modeling) or from mechanistic a priori knowledge (physical models). With the rise of automation and modern advances in data science, the two approaches are gradually merging, although model identification for complex multivariable systems remains challenging in practice. This work targets the identification of interpretable and generalizable physical models by means of automatable, data-driven methods that require no a priori knowledge. A revised mixed-integer nonlinear programming (MINLP) formulation is proposed for symbolic regression (SR) to identify physical models from noisy experimental data. Interpretable and generalizable models are identified by assessing model complexity and extrapolation capability. The method is demonstrated by successfully identifying a kinetic model of the 4-nitrophenyl acetate (PNPA) hydrolysis reaction.