A Multimedia Summarizer Integrating Text and Images / D’Acierno, Antonio; Gargiulo, Francesco; Moscato, Vincenzo; Penta, Antonio; Persia, Fabio; Picariello, Antonio; Sansone, Carlo; Sperli', Giancarlo. - 40:(2015), pp. 21-33. (Paper presented at the 8th KES International Conference on Intelligent Interactive Multimedia Systems and Services, IIMSS-2015, held in Italy, 17-19 June 2015) [10.1007/978-3-319-19830-9_3].
A Multimedia Summarizer Integrating Text and Images
MOSCATO, VINCENZO;PICARIELLO, ANTONIO;SANSONE, CARLO;SPERLI', GIANCARLO
2015
Abstract
We present a multimedia summarizer system that retrieves relevant information from web repositories by extracting semantic descriptors of documents. In particular, the semantics of each document's textual sentences is expressed as a set of assertions of the form ⟨subject, verb, object⟩, as in the RDF data model, while the semantics of images is captured by a set of keywords derived from high-level information such as the related title, description, and tags. We leverage an unsupervised clustering algorithm that exploits the notion of semantic similarity and use the cluster centroids to determine the most significant summary sentences. At the same time, several images are attached to each cluster on the basis of the term frequency of their keywords. Finally, several experiments are presented and discussed.
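The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the clustering algorithm, so the sketch assumes a token-overlap (Jaccard) score as a stand-in for semantic similarity, a greedy threshold-based grouping as a stand-in for the unsupervised clustering, and a medoid as the cluster "centroid". All function names, the threshold value, and the data layout are hypothetical.

```python
from collections import Counter


def tokens(triple):
    """Lower-cased word set of a (subject, verb, object) triple."""
    return {word.lower() for part in triple for word in part.split()}


def similarity(a, b):
    """ASSUMPTION: Jaccard token overlap stands in for semantic similarity."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster(triples, threshold=0.15):
    """ASSUMPTION: greedy grouping; a triple joins the first cluster whose
    seed (first member) is similar enough, otherwise it starts a new one."""
    clusters = []
    for triple in triples:
        for members in clusters:
            if similarity(triple, members[0]) >= threshold:
                members.append(triple)
                break
        else:
            clusters.append([triple])
    return clusters


def medoid(members):
    """Member with the highest total similarity to its cluster,
    used here as the cluster's representative summary sentence."""
    return max(members, key=lambda t: sum(similarity(t, o) for o in members))


def attach_images(members, images, top_k=1):
    """Rank images by the summed term frequency of their keywords
    within the cluster; images with no matching keyword are skipped."""
    tf = Counter(w for t in members for part in t for w in part.lower().split())
    scored = [(sum(tf[k.lower()] for k in kws), name) for name, kws in images.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_k]]


def summarize(triples, images):
    """Return one (representative triple, attached images) pair per cluster."""
    return [(medoid(c), attach_images(c, images)) for c in cluster(triples)]
```

A call such as `summarize(triples, images)` takes a list of triples and a dictionary mapping image names to keyword lists (keywords that, in the paper's setting, would come from the image title, description, and tags). Swapping `similarity` for a real semantic measure and `cluster` for a proper clustering algorithm would leave the rest of the sketch unchanged.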