Digital Economy Dispatch #154 -- From Scrolls to Tweets: Adapting to Data Abundance in a Digital Age
22nd October 2023
I can’t keep up! It’s a complaint that I hear all the time as I talk with people. It seems that we’re being swamped with digital materials of every kind. It sometimes seems like all we do is try to stay on top of the barrage of new content, whether it is accessing the wide variety of webpages and news apps we use each day, browsing posts on social media channels such as Facebook and X, or engaging with the never-ending stream of emails, blog posts, LinkedIn articles, and so on.
Of course, beyond the individual issues, dealing with so much data raises more fundamental concerns. For organizations with a history of working in situations of data scarcity, the implications of living in a world of data abundance can be overwhelming. The opportunities and challenges for data have morphed from being concerned primarily with locating and owning a few data sources in narrow areas of interest toward the more difficult need to determine the relevance, accuracy, value, and utility of extensive datasets that come from many sources, with varied provenance, and in diverse formats.
In practice, this data abundance opens new opportunities, but also raises new challenges. Organizations have gone from limited data but extensive governance and control, to much more data yet significantly less control over its utility. A different attitude to data is required to take best advantage of this shift. In a data-rich environment, digital leaders must refocus their attention away from the concerns of ensuring data availability as an input to reactive human tasks and toward the needs of establishing data provenance and quality to drive decision making and automate predictive actions.
A Digital Data Archive
I remember discussing this shift some time ago with digital leaders at the UK’s National Archives in their facility at Kew on the outskirts of London. They are the official archive and publisher for the UK Government, responsible for over 1,000 years of national documents. It is fascinating to take a tour of the facility, where they hold all sorts of documents, including linen scrolls signed by ancient kings and queens of England and government papers recording important decisions from past centuries. I even got to see the handwritten notes of Alan Turing from his work at Bletchley Park.
While in the past these were physical documents of many kinds, today, of course, over 80 million of them have been digitized and almost all new materials they record are electronic records. Whether PDFs of notes taken at a boring meeting or indiscreet tweets from an enraged government official, the National Archives is asked to record them all for posterity. As a result, they have huge repositories of information, and they are growing at an alarming rate.
Their digital strategy offers a fascinating view into the range of concerns that we all must now address in a digital age: digital overload. What is interesting is that their perspective as a national archive offers a glimpse into the fundamental role that data plays: “to help understand the past, make sense of the present, and to guide us for the future”. In particular, I was struck by three distinct challenges that they address.
The first challenge that managing this archive brings is simply knowing what to save and what to ignore. In their case, only a relatively small proportion of what is created is appraised and selected for long-term preservation. Traditionally, this involves a lot of manual effort and expert judgment. It relies on a largely stable catalogue and traditional indexing approaches.
In a digital world this has to be adjusted significantly. So much new digital data is being created that only a small fraction is recorded in the archive. But more importantly, it is no longer easy or obvious how to decide what is needed or how to organize it. They are now asked to keep all sorts of content, from threaded discussions and online comments to video, websites, structured datasets, and computer code. What is useful to save and what organizational structure should be used?
Second, the National Archives needs to be able to make sense of the data they record at some point in the future. Often this can be many years later, when the information is required to respond to some new query or emerging need. As digital data formats and standards evolve, concerns arise about how to ensure that the data they record has not been tampered with and can readily be used. The tools and code needed to read the data may therefore also need to be recorded alongside it. Consequently, the digital archive needs to understand and manage a complex set of dependencies. What will be required to make sense of recorded digital data now and in the future?
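To make the tampering and dependency concerns a little more concrete, here is a minimal sketch in Python of one common way to approach them. It stores a cryptographic fingerprint alongside a note of the format and tools needed to read each item. This is purely illustrative and not a description of how the National Archives works: the `ArchiveRecord` structure, the function names, and the sample file are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArchiveRecord:
    """Hypothetical archive entry: a fingerprint plus what is needed to read the item."""
    name: str
    sha256: str        # fingerprint computed when the item was recorded
    file_format: str   # format the content was stored in
    reader_tools: List[str] = field(default_factory=list)  # tools or code needed to open it


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def ingest(name: str, content: bytes, file_format: str, reader_tools: List[str]) -> ArchiveRecord:
    """Record an item together with its fingerprint and reading dependencies."""
    return ArchiveRecord(name, fingerprint(content), file_format, list(reader_tools))


def is_unaltered(record: ArchiveRecord, content: bytes) -> bool:
    """Check that the content still matches the fingerprint taken at ingest."""
    return fingerprint(content) == record.sha256


# Illustrative usage with made-up content
record = ingest("meeting-notes.pdf", b"notes from a boring meeting", "PDF", ["pdf-reader"])
print(is_unaltered(record, b"notes from a boring meeting"))           # True
print(is_unaltered(record, b"notes from a boring meeting (edited)"))  # False
```

A fingerprint only shows that something has changed, not what or why, so in practice it would sit alongside the wider provenance and dependency information discussed above.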
Third, the way people want to access and use recorded data has changed. In the past, there were very controlled circumstances in which a limited set of people requested specific data which (in the case of paper records) was fetched from the archive and accessed in person in a room at the National Archives. Every aspect of that process could be managed and controlled.
Now, of course, digital data is accessed in much more flexible ways by a growing number of people in widely different circumstances. Furthermore, users have high expectations of digital products. They expect services to be intuitive, for transactions to be simple, and for results to be immediate. This places a lot more pressure on data handling approaches and on the need to invest in data access, security, and recovery procedures. How should the balance between keeping data safe and broadening access be addressed?
Three Data Management Lessons for Digital Leaders
The National Archives is an interesting example of how and why the role, importance, and use of data are evolving. As they are one of the UK’s foremost institutions in data management, it is useful to reflect on their particular approach. From this we can identify three key lessons that are essential for all digital leaders in today’s world of data abundance:
Refocus Data Strategy: Digital leaders should reassess their organization's data strategy in a world of data abundance. Instead of just ensuring data availability, they should shift their focus toward establishing data provenance and quality. This means prioritizing data quality and reliability to drive decision-making and automate predictive actions.
Data Prioritization and Organization: In the face of overwhelming data abundance, digital leaders should implement strategies for better data prioritization and organization. They should invest in advanced data management techniques and technologies that help identify what data is valuable, relevant, and worth preserving. This may involve adopting new methods for classifying, indexing, and organizing data in diverse formats.
Adapt to Changing User Expectations: With the evolving ways people access and use recorded data, digital leaders should adapt to changing user expectations. This means investing in user-friendly interfaces, intuitive data access systems, robust security measures, and efficient data recovery procedures. Striking a balance between keeping data safe and broadening access is crucial to meet the demands of a diverse user base. Digital leaders should prioritize user experience and convenience while maintaining data security and integrity.