This Knowledge Synthesis project seeks to examine existing formal and informal literature around best practices for teaching data literacy at the post-secondary level: What data skills are required to be data literate? How are these skills best taught across programs? What best practices have we established after decades (and centuries) of teaching students to work with data in various forms? We welcome any insight into these questions; please send us comments, feedback, or pointers to related resources!

The results of our examination will be published here, at dataliteracy.ca, and disseminated through various venues.

The proposal overview, as submitted to SSHRC, is below!

The Proposal

We are a data-rich society; perhaps even data-driven (Pentland, 2013). In 2012, analysts estimated 90% of the world’s data had come into existence within the previous 2 years (Vesset et al., 2014). Organizations in all sectors are struggling with this volume of data, confident that despite the velocity at which it is growing, and the variety of its formats, there is value. The goal is to transition from being data-rich to being information-rich and knowledge-rich, for which we need both data scientists and people capable of working effectively with data. The McKinsey Global Institute suggested that at current training rates, in the US alone there will be 140,000-190,000 more jobs than trained data scientists by 2018 (Manyika et al., 2011). They also estimate a 1,500,000 employee shortfall of “data-savvy” analysts and managers capable of working with the data to make effective decisions; IDC suggests a similar number (Vesset et al., 2014). This latter set of skills, which we refer to as data literacy, has not been taught systematically in post-secondary education in Canada.

Across academic disciplines and throughout the private sector, we are recognizing a growing need for data-literate graduates from all backgrounds. The recent Tri-Council consultation document on digital scholarship (Government of Canada, 2013) recognizes this challenge, and the issue of training in particular: “Digital data are the raw materials of the knowledge economy, and are becoming increasingly important for all areas of society, including industry… The same may be said of the capacity to capture, manage and preserve it, or the requisite training of personnel who can operate effectively in this milieu” (Government of Canada, 2013 [4]). This recognition prompts our core question: How can post-secondary institutions in Canada best equip graduates with the knowledge, understanding, and skills required for the data-rich knowledge economy?

Data literacy is the ability to comprehend, create, and communicate data, and is the first level of the tri-level literacy, fluency, mastery scale. Data-literate individuals have the knowledge, understanding, and skills to connect people to data. Data literacy spans both qualitative and quantitative data, and is enabled by a broad range of data-related capabilities and learning outcomes, including but not limited to:

  • Data collection and grounding in sound methodology; creating data sets with appropriate metadata.
  • Data management—how to structure, store, preserve, harmonize, and enable sharing of raw data.
  • Data analysis—how to transform raw data into usable information and/or knowledge; incorporates the process of approaching an unfamiliar data set, understanding it, and identifying core features or anomalies; performing appropriate summations, aggregations, highlights, etc.; reaching appropriate conclusions & insights; and achieving relevant results.
  • Data visualization, and the honest, ethical, accurate, and compelling graphic representation of data.
  • Data policy, regarding privacy, security, retention, organization, openness, integrity, metadata, data models, open data, and sharing.
  • Data dissemination and sharing, metadata; how to make data open and interoperable.
  • Creation, maintenance, and use of metadata, including measures of data quality.
  • Evidence-based decision-making, and in general the effective and ethical use of data to inform policy-making, decisions, or even personal opinions.

Elements of data literacy are taught, explicitly or implicitly, across all disciplines and at all levels of post-secondary institutions. Faculty have substantial expertise in these areas and many students will graduate with some level of data literacy. Although the necessity for data literacy spans disciplines, best practices for teaching it do not. There are pockets of excellence in providing the knowledge, understanding, and skills each academic program has identified as important (such as data analysis in business or data collection in sociology), but there is no systematic approach to understanding how best to teach data literacy across programs, and no common standard for certifying data literacy. Finally, and perhaps most importantly, data literacy is not taught as a transferable skill; students learn how to work with data in their specialty, often in a research context, but are not cognizant of the broad applicability of such skills. These best practices from specific disciplines and for particular kinds of data literacy have not been merged into a transdisciplinary pedagogy.

Changes in technology have previously prompted updated definitions of literacy. For example, the emergence of the Internet as a global force in the early 2000s prompted a variety of efforts to redefine literacy. Some focused on what it meant to be literate when the text was an online source instead of print (e.g. Kinzer & Leander, 2003; Leu, Kinzer, Coiro, & Cammack, 2004; Smolin & Lawless, 2003); others asserted that new skills- and technology-based literacies would be required to navigate the 21st century (e.g. Eshet, 2002; Oxbrow, 1998); others posited that new ways of thinking about ideas, new forms of digital literacy, and new kinds of teaching and learning were required (e.g. Bawden, 2001). Each definition of evolving literacy is predicated on already-existing foundational literacies, and responds simultaneously to emergent technologies, social forces, and pedagogies (Leu et al., 2004).

Martinovic & Freiman, in a 2014 SSHRC Knowledge Synthesis report, concluded that modern students still need explicit digital literacy training to be effective in the workforce, and that best practices for providing it are not well-studied or well-documented. A rapidly emerging technology must be leveraged effectively to compete globally, and thus prompts public-policy initiatives to encourage new modes of education. Data literacy is a response to a similar evolution: the confluence of inexpensive storage, new data generators ranging from social media to the Internet of Things, the invisible accumulation of data, the “infinite” space of cloud storage, and the expectation that organizations make effective use of data to compete globally has resulted in the emerging challenge of Big Data. To better understand the relationship with earlier forms of literacy, we will examine and compare our synthesis with previous reviews and syntheses in other areas of literacy taught at the post-secondary level, including information literacy (e.g. Mounce, 2010), digital literacy (e.g. Mills, 2010; Martinovic & Freiman, 2014), and technical literacy (e.g. Penuel, 2006).

We propose a transdisciplinary examination of existing strategies and best practices for teaching data literacy, synthesizing documented explicit knowledge using a narrative-synthesis methodology (see Work Plan section for details; e.g. Grimshaw, 2010; Petticrew & Roberts, 2005; Popay et al., 2006) and identifying areas where additional research is needed. While some data literacy strategies and best practices are captured in formal, peer-reviewed academic literature, we expect others to be captured informally. Our information-gathering process will be broad, including formal peer-reviewed literature, grey literature on data training, and informal knowledge sharing via blogs and social media. The transdisciplinary team will assess the quality of informally captured knowledge.

Our knowledge synthesis will be guided by the following sub-questions, which delineate the scope of our collective inquiry and are subject to revision as our literature review proceeds:

  • What innovative and collective approaches to teaching data literacy are being developed by educational institutions, particularly universities, colleges and institutes, and what learning outcomes have been identified to date? (based on subtheme 1.b)
  • What is the Canadian education system at the post-secondary level doing to develop adequate and sustainable data literacy skills? (based on subtheme 1.d)
  • What roles do post-secondary institutions play in meeting society’s demand for data-literate graduates? (based on subtheme 2.b)
  • How are new ways of learning and teaching fostering greater knowledge and competency in critical and analytical thinking, problem-solving, communication of complex ideas and data, data collection methodologies, data management, data policy, data sharing, and evidence-based decision-making? (based on subtheme 2.g)

 

References

Bawden, D. (2001). Information and digital literacies: a review of concepts. Journal of Documentation, 57(2), pp. 218–259.

Eshet, Y. (2002). Digital literacy: A new terminology framework and its application to the design of meaningful technology-based learning environments. In Proceedings of ED-MEDIA 2002 World Conference on Educational Multimedia, Hypermedia & Telecommunications. Association for the Advancement of Computing in Education.

Government of Canada. (2013). “Capitalizing on Big Data: Toward a Policy Framework for Advancing Digital Scholarship in Canada” [consultation document]. Available: http://www.sshrc-crsh.gc.ca/about-au_sujet/publications/digital_scholarship_consultation_e.pdf

Grimshaw, J. (2010). A Knowledge Synthesis Chapter. Canadian Institute of Health Research (CIHR), Available: http://www.cihr-irsc.gc.ca/e/41382.html.

Kinzer, C.K., & Leander, K. (2003). Technology and the language arts: Implications of an expanded definition of literacy. In J. Flood, D. Lapp, J.R. Squire, & J.M. Jensen (Eds.), Handbook of research on teaching the English language arts (2nd ed., pp. 546-566). Mahwah, NJ: Erlbaum.

Martinovic, D. and Freiman, V. (2014). Digital Skills Development for Future Needs of the Canadian Labour Market. 2014 SSHRC Knowledge Synthesis Grant Report.

Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., and Byers, A. H. (2011). Big data: The next frontier for innovation, competition, and productivity [report]. McKinsey Global Institute.

Mills, K. A. (2010). A review of the “digital turn” in the New Literacy Studies. Review of Educational Research, 80(2):246–271.

Mounce, M. (2010). Working together: Academic librarians and faculty collaborating to improve students’ information literacy skills: A literature review 2000–2009. The Reference Librarian, 51(4):300–320.

Oxbrow, N. (1998). Information literacy – the final key to an information society. Electronic Library, 16(6), pp. 359–360.

Pentland, A. (2013). The data-driven society. Scientific American, 309(4):78–83.

Penuel, W. R. (2006). Implementation and effects of one-to-one computing initiatives. Journal of Research on Technology in Education, 38(3):329–348.

Petticrew, M., & Roberts, H. (2005). Systematic reviews in social sciences: a practical guide. Wiley Blackwell.

Popay, J., Roberts, H., Sowden, A., Petticrew, M., Arai, L., Rodgers, M., Britten, N., Roen, K., and Duffy, S. (2006). Guidance on the conduct of narrative synthesis in systematic reviews. A product from the ESRC methods programme. Lancaster: Institute of Health Research.

Smolin, L.I., & Lawless, K.A. (2003). Becoming literate in the technological age: New responsibilities and tools for teachers. The Reading Teacher, 56, 570-577.

Vesset, D., Olofson, C. W., Schubmehl, D., McDonough, B., Woodward, A., Stires, C., Fleming, M., Nadkarni, A., Zaidi, A., and Dialani, M. (2014). IDC FutureScape: Worldwide Big Data and Analytics 2015 Predictions [report]. International Data Corporation.