For many of us informatics nerds, 2020 may well be remembered as the year when the rest of the world finally began to appreciate the importance of informatics. While the average layperson tracking the progression of COVID-19 across the country and in their communities may not realize it, they’ve been studying analytics dashboards online since March.
In those various COVID-19 dashboards, we see the reflection of what’s occurring in labs around the world as we collectively fight the spread of SARS-CoV-2. We rely on the data to gain an understanding of how the pandemic impacts our lives, our families, our jobs and schooling, travel plans, and the communities in which we live.
But behind all those dashboards that rose to prominence this year lies a bizarre irony: compared to other industries, laboratories are far behind when it comes to digital transformation and analytics.
Walking into many labs, you will still find a heavy reliance on paper, Excel spreadsheets and other disparate computer systems – all of which are collecting information in a haphazard and often siloed manner. While information technology has raced beyond anything we could have imagined even twenty years ago, laboratory informatics has lagged.
The existence of LIMS, ELN, SDMS and other informatics systems has changed things – to some degree. Data consolidation is more structured, and the data is now more readily usable across different platforms. But are labs effectively using the data they collect?
The Data Is Not Undiscovered. It’s Just Unused.
Consider the pharmaceutical industry.
Pharma is an industry in which the sheer volume of collected data offers incredible potential for competitive advantage in the discovery, development and marketing of products.
The reality of the industry – with some exceptions, of course – is much different. Most companies use this massive data repository only for static report generation or trend analysis.
But what if this data could be unlocked for something more powerful, like predictive and prescriptive analytics to improve future planning of resources, or to remove bottlenecks and gain more efficiency?
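As a rough illustration of what that kind of predictive analytics could look like, the sketch below fits a simple trend line to monthly sample volumes in order to forecast next month’s workload. The numbers are hypothetical and the use of scikit-learn is an assumption for illustration, not a prescribed approach – a real lab would pull these counts from its LIMS and likely use a richer forecasting model.

```python
# Minimal sketch: forecasting next month's incoming sample volume
# from historical monthly counts (hypothetical data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical monthly sample counts for the past twelve months
monthly_counts = np.array([310, 295, 340, 360, 355, 400,
                           390, 420, 415, 450, 470, 480])
months = np.arange(len(monthly_counts)).reshape(-1, 1)

# Fit a simple linear trend and extrapolate one month ahead
model = LinearRegression().fit(months, monthly_counts)
next_month = model.predict([[len(monthly_counts)]])
print(f"Forecast for next month: ~{next_month[0]:.0f} samples")
```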
Ensuring Lab Data Can Be Used for Decision Making
Using this data would not only make your lab smarter but would also allow intelligent reuse outside the lab. To maximize informed decision making, the data most labs collect needs to be brought together from many different systems – LIMS, ERP, CRM, SDMS and other IT infrastructure. Depending on those systems and their interoperability, the data may be locked in silos. In some cases, it may even be unstructured, hindering its use – or even awareness of its existence.
FAIR Data – Using a Common Structure
One of the biggest challenges with diverse datasets is that the data doesn’t always have the same identifiers and/or naming conventions. For example, the unique identifier for a sample might be the sample ID, the batch ID, the lot ID or one of many other such identifiers. And beyond proper collection, annotation, and archiving of data, it’s important that the data can be found and re-used for other purposes when needed.
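To make the identifier problem concrete, here is a minimal, hypothetical sketch of reconciling two system exports that refer to the same samples under different identifier names. The column names, values and the use of pandas are assumptions for illustration only:

```python
# Hypothetical sketch: a LIMS export calls the identifier "sample_id",
# while an ERP export calls the same value "lot_id". Mapping both onto
# one canonical key lets the records be joined.
import pandas as pd

lims = pd.DataFrame({"sample_id": ["S-001", "S-002"],
                     "assay": ["potency", "purity"]})
erp = pd.DataFrame({"lot_id": ["S-001", "S-002"],
                    "batch_status": ["released", "on hold"]})

# Rename to a shared canonical identifier before merging
lims = lims.rename(columns={"sample_id": "canonical_id"})
erp = erp.rename(columns={"lot_id": "canonical_id"})

merged = lims.merge(erp, on="canonical_id", how="outer")
print(merged)
```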
To overcome these challenges, we need to make the data FAIR:
- Findable: easy to identify and find (for both humans and computers!), with metadata that facilitates searching for specific datasets.
- Accessible: stored for the long term so that it can easily be accessed and/or downloaded, with well-defined access conditions, whether at the level of metadata or actual data.
- Interoperable: ready to be combined with other datasets by humans or computers, without ambiguities in the meanings of terms and values.
- Reusable: ready to be used for future research and to be further processed using computational methods.
The FAIR principles have been embraced by both the European Commission and the G20.
An important step in the FAIR data approach is to publish existing and new datasets in a semantically interoperable format that can be understood by computers. (Used in this context, ‘semantics’ is the meaning or intent of a digital object.) By semantically annotating data items and metadata, we can use computer systems to (semi-)automatically combine different data sources, resulting in greater knowledge discovery. Machine-actionable data can also unlock the power of Machine Learning and Artificial Intelligence (AI).
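As one example of what semantic annotation can look like, the snippet below describes a dataset using schema.org terms in JSON-LD, so that a machine can interpret each field unambiguously. The dataset name, identifier and values are hypothetical, and schema.org is only one of several vocabularies a lab might adopt:

```python
# Hypothetical sketch: machine-actionable dataset metadata expressed
# as JSON-LD using schema.org terms.
import json

dataset_metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Stability study results, lot S-001",
    "identifier": "S-001",
    "description": "HPLC purity results collected over 12 months",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "variableMeasured": "purity (%)",
}
print(json.dumps(dataset_metadata, indent=2))
```

Because the terms come from a shared, published vocabulary, another system can discover this dataset, interpret its fields and combine it with other annotated sources without human intervention.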
From Data to AI
For AI applications to add value in a way that can be trusted, machine learning must interact seamlessly with the existing informatics landscape – both hardware and software.
What role can AI play? Machine learning has carved out a proven niche in anomaly detection and preventive maintenance. AI could take this to the next level and open up entirely new opportunities for optimizing product design and increasing lab efficiency. It could also support predictive planning, based on a wealth of both historical and current data.
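As a minimal sketch of the anomaly-detection use case, the example below flags unusual instrument readings with an Isolation Forest. The simulated readings and the choice of scikit-learn are illustrative assumptions – a real deployment would train on historical telemetry from the lab’s own instruments:

```python
# Hypothetical sketch: flagging anomalous instrument readings
# with an Isolation Forest (simulated data).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=0)
# Simulated baseline sensor readings plus two injected outliers
normal = rng.normal(loc=25.0, scale=0.5, size=(200, 1))
outliers = np.array([[31.0], [18.5]])
readings = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.01, random_state=0)
labels = detector.fit_predict(readings)   # -1 marks anomalies

print(f"Flagged {np.sum(labels == -1)} anomalous readings")
```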
According to Gartner’s Top 10 Trends in Data and Analytics for 2020, “by the end of 2024, 75% of enterprises will shift from piloting to operationalizing AI, driving a 5X increase in streaming data and analytics infrastructures.” Another trend Gartner points out: “by 2022, public cloud services will be essential for 90% of data and analytics innovation.”
The ability to put these types of tools to work is the objective of a lab’s digital transformation. When different systems are connected and sharing data, labs reap a massive benefit. The more the systems share and the better they are integrated, the more a lab benefits – and the lower the risks. Digital transformation is about connectivity – not only between systems, but across global sites – and using data in multiple ways to benefit your business.
But digital transformation starts with raw data. By embracing the idea of reusing your data, you are one step closer to the digital transformation of your laboratory.
Learn more about Digital Transformation at LabVantage.com or download our Digital Transformation white paper here.