Disseminating statistics in Europe: the Language perspective / Polese, Vanda. - (2013).
Disseminating statistics in Europe: the Language perspective
POLESE, VANDA
2013
Abstract
The research is conceived as a contribution towards improving statisticians’ awareness of the language they use in communicating data. It raises questions about how to address the international public more effectively and offers a picture of the state of the art. It is an attempt to describe features and patterns of the discourse of statistics. The research is organised into two parts: the main part is a corpus-based analysis of statistical texts; this is followed by an overview of the relationship European statisticians have with the texts they produce and some suggestions on how to improve the clarity and accessibility of statistical texts in English.
To investigate the language of statistics, an English comparable corpus (ECC; Baker 1995: 234) was designed. It includes both translated and non-translated comparable English texts. The methodology adopted for the corpus design follows Bowker and Pearson’s (2002) Working with Specialized Language: A Practical Guide to Using Corpora.
The entire PDF file of each yearbook was downloaded. In some cases, such as the Czech Republic, each chapter had to be downloaded separately, since each one was a separate PDF. For Estonia, Italy, Lithuania, Portugal and Slovenia, the yearbooks are published in English with facing-page translation. All yearbooks contain tables, figures and footnotes. Once downloaded, all files were saved in .txt format. In the new files all tables, figures and footnotes were deleted; footnotes typically indicate the source of the data and were therefore not relevant to the aim of the study. The Contents pages were also deleted, as were all parts of the text written in the national language in the case of facing-page translation. Each file was named with its national domain and the yearbook year (e.g. IT2009, PT2008, ND2007). The whole corpus was named ENSY (European National Statistical Yearbooks).
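The file-naming convention described above (national domain followed by the yearbook year) can be illustrated with a minimal Python sketch. The helper names `make_name` and `parse_name` and the two-letter/four-digit pattern are assumptions inferred from the examples in the text (IT2009, PT2008); the study does not say how the naming was actually carried out.

```python
import re
from typing import Tuple

# Assumed pattern: two uppercase letters (national domain) + four-digit year,
# inferred from the examples IT2009 and PT2008 given in the text.
NAME_PATTERN = re.compile(r"^([A-Z]{2})(\d{4})$")


def make_name(domain: str, year: int) -> str:
    """Build a corpus file name such as 'IT2009' (hypothetical helper)."""
    name = f"{domain.upper()}{year}"
    if NAME_PATTERN.match(name) is None:
        raise ValueError(f"not a valid corpus file name: {name!r}")
    return name


def parse_name(name: str) -> Tuple[str, int]:
    """Split a corpus file name back into (domain, year) (hypothetical helper)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid corpus file name: {name!r}")
    return match.group(1), int(match.group(2))
```

For instance, `make_name("it", 2009)` yields the name used for the 2009 Italian yearbook, and `parse_name("PT2008")` recovers the domain and year from a stored file name.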
The time span covered by the research is 2005-2010. For each country, the last three yearbooks published in English and available online were included. The corpus consists of the English translations of European National Statistical Yearbooks from 13 European countries and Eurostat. Since the study focuses on translated statistical language, all the above-mentioned files were grouped in a subcorpus named Transtat (Translated statistical texts). Then, following the same procedure, four yearbooks from Ireland (2007, 2008, 2009, 2010) were collected and grouped in the Natstat (Native English statistical texts) subcorpus. UK yearbooks were excluded because the last UK National Statistical Yearbook was published in 2005 and has a very different structure, providing a very long description of the country's life and policy but few data. The new yearbooks published by UK Statistics are at regional level and do not cover all the topics included in other National Statistical Yearbooks. Another subcorpus was created for the Eurostat yearbooks (2006/7, 2008, 2009, 2010) and was named Eustat (EU statistical texts).
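The grouping into subcorpora can be sketched in the same way. This is only an illustration under stated assumptions: it treats the three subcorpora as disjoint, uses "IE" for the Irish files, and uses "EU" as a hypothetical label for the Eurostat files, since the text does not give the file names used for Ireland or Eurostat. The function names are likewise invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumptions for illustration only: "IE" marks the Irish (native English)
# yearbooks and "EU" is a hypothetical prefix for the Eurostat yearbooks.
NATIVE_CODES = {"IE"}
EUROSTAT_CODES = {"EU"}


def subcorpus_for(name: str) -> str:
    """Return the subcorpus a file name belongs to, based on its two-letter prefix."""
    code = name[:2].upper()
    if code in NATIVE_CODES:
        return "Natstat"
    if code in EUROSTAT_CODES:
        return "Eustat"
    # Every other national yearbook is treated as a translated text.
    return "Transtat"


def group_files(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group corpus file names by subcorpus, preserving input order within each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[subcorpus_for(name)].append(name)
    return dict(groups)
```

Called on a list such as `["IT2009", "IE2008", "EU2010", "PT2008"]`, `group_files` places the Italian and Portuguese files in Transtat, the Irish file in Natstat and the Eurostat file in Eustat.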