How data science helps GSK with the digitisation of laboratories

The race to produce new drugs and vaccines has kept the pharma sector under pressure for years, and that pressure has only increased during the coronavirus crisis. Not even pharma giant GlaxoSmithKline (GSK) is immune. Could technology help the company accelerate the development of new vaccines? Could it help automate and improve its lab processes? By automating one of GSK’s lab tests with machine learning, Ordina’s data scientists showed that the answer to both questions is a resounding yes, with impressive time savings and improved accuracy.

GSK, the fifth largest pharma company in the world, focuses on vaccine development in Belgium; its site in Wavre is the biggest vaccine production site in the world. Developing a vaccine is an incredibly time-consuming process that can take up to 15 years, which is why GSK wants to invest in automation in Wavre, among other things. Ordina’s excellent reputation in the life sciences sector prompted the pharma giant to come knocking on our door.

Automation of lab tests with flow cytometer

‘GSK enlisted our help with the automation of a lab test with the flow cytometer – a piece of equipment that measures and analyses the size, shape and physical and chemical characteristics of cells,’ Kimberly Hermans of our data science team explains.

Flow cytometry is a commonly used technique in testing drugs and vaccines. In short, it works like this: a sample of cells in a solution is placed in a flow cytometer and passed through a focused laser beam, after which detectors identify the cell characteristics and count and quantify the particles. After this test, the researchers select the most useful cells. ‘This selection – known as gating – happens in four steps, all of which are done manually,’ Kimberly continues. ‘We automated this process completely using data models and techniques such as clustering and anomaly detection.’
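To make the idea of automated gating more concrete, here is a minimal sketch in Python. It is not the model Ordina built for GSK, which the article does not describe in detail. It assumes just two illustrative steps on simulated data: an anomaly-detection step that discards debris using a robust (median-based) outlier score on forward scatter, and a clustering step that splits the remaining cells into two populations with a simple one-dimensional k-means. The field names (fsc, marker), the threshold of 3.5 and the simulated values are all hypothetical.

```python
import random
import statistics


def simulate_events(seed=0):
    """Create hypothetical cytometer events (all values are made up)."""
    rng = random.Random(seed)
    events = []
    for _ in range(300):  # cells with low marker intensity
        events.append({"fsc": rng.gauss(500, 30), "marker": rng.gauss(100, 15)})
    for _ in range(200):  # cells with high marker intensity
        events.append({"fsc": rng.gauss(500, 30), "marker": rng.gauss(400, 30)})
    for _ in range(20):   # debris: very small forward scatter
        events.append({"fsc": rng.gauss(50, 10), "marker": rng.gauss(100, 15)})
    return events


def mad_inlier_mask(values, threshold=3.5):
    """Anomaly detection: True where the modified z-score is within threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:  # no spread at all, so nothing can be flagged
        return [True] * len(values)
    return [abs(0.6745 * (v - med) / mad) <= threshold for v in values]


def two_means_1d(values, max_iter=100):
    """Clustering: plain k-means with k=2 on one-dimensional values."""
    low_c, high_c = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [0 if abs(v - low_c) <= abs(v - high_c) else 1 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        if not high:  # all values identical: nothing to split
            break
        new_low, new_high = statistics.fmean(low), statistics.fmean(high)
        if (new_low, new_high) == (low_c, high_c):  # converged
            break
        low_c, high_c = new_low, new_high
    return labels, (low_c, high_c)


def gate(events):
    """Run the two illustrative gating steps and return the selections."""
    keep = mad_inlier_mask([e["fsc"] for e in events])
    cells = [e for e, ok in zip(events, keep) if ok]
    labels, centres = two_means_1d([e["marker"] for e in cells])
    positive = [e for e, lab in zip(cells, labels) if lab == 1]
    return cells, positive, centres


if __name__ == "__main__":
    events = simulate_events()
    cells, positive, centres = gate(events)
    print(f"{len(events)} events, {len(cells)} kept, {len(positive)} marker-positive")
```

A real pipeline would read the measured events from the instrument's export files and would probably rely on multi-dimensional methods from a machine learning library. The sketch only shows how a manual selection step can be replaced by a rule that is learned from the data itself.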

Faster, more accurate and more insights

The result? Whereas a researcher previously needed 30 to 40 minutes per test, the test is now completed in 5 minutes, without the intervention of a scientist. What’s more, the results are more accurate than manual selection. A good deal of extra documentation is also provided: at every step of the gating process, statistics are generated automatically (e.g. the number of discarded cells and detected clusters), which can also be visualised. ‘Thanks to these data visualisations, the researchers gain insight into the decisions the algorithm makes. This helps them improve and refine the gating procedures further,’ Kimberly explains.
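The per-step statistics mentioned above could be collected along the following lines. This is a sketch under stated assumptions, not a description of the actual GSK tooling: the GateStep record, the run_gates helper, the gate names and the thresholds are all hypothetical, and only two of the four gating steps are mimicked.

```python
from dataclasses import dataclass


@dataclass
class GateStep:
    """Statistics recorded for one gating step."""
    name: str
    events_in: int
    events_out: int

    @property
    def discarded(self):
        return self.events_in - self.events_out

    @property
    def retained_pct(self):
        if self.events_in == 0:
            return 0.0
        return 100.0 * self.events_out / self.events_in


def run_gates(events, gates):
    """Apply (name, predicate) gates in order and log statistics per step."""
    report = []
    for name, predicate in gates:
        kept = [e for e in events if predicate(e)]
        report.append(GateStep(name, len(events), len(kept)))
        events = kept
    return events, report


# Hypothetical measurements: forward scatter (fsc) and side scatter (ssc).
fsc = [50, 80, 300, 320, 310, 305, 298, 330, 315, 301]
ssc = [100, 120, 400, 410, 950, 390, 405, 415, 395, 402]
events = [{"fsc": f, "ssc": s} for f, s in zip(fsc, ssc)]

# Hypothetical gates with made-up thresholds.
gates = [
    ("remove debris", lambda e: e["fsc"] > 100),
    ("remove high side scatter", lambda e: e["ssc"] < 900),
]

selected, report = run_gates(events, gates)
for step in report:
    print(f"{step.name}: {step.events_in} in, {step.discarded} discarded, "
          f"{step.retained_pct:.1f}% retained")
```

Keeping one record per step means the same numbers can feed both a written report and a chart, which is the kind of traceability the researchers use to review the algorithm's decisions.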


Further optimisation and refinement

The proof of concept Ordina put forward was very well received. Going forward, Ordina and GSK will improve the gating algorithms further to allow for an even more accurate selection of cells and particles. The algorithm will also be refined so that it can be used simultaneously on multiple nuclei, which will save even more time. Ordina also ensures that all data is stored securely in databases, making it easily accessible to anyone who wants to use it. Beyond gating, many other lab tests can be automated through data science, data visualisation and robotics.

From compliance to innovation

‘Ordina is primarily known in the pharma world for its expertise in compliance and validation. But increasingly, that expertise is being extended to other projects too. After all, we know the sector inside out. By combining this experience and expertise with our knowledge of IT and new technology such as data science, we are able to achieve great results,’ Kimberly concludes. ‘Data science offers enormous potential to the pharma sector and the healthcare sector full stop. By bundling data, looking for patterns in it and running predictive models on it, we can radically optimise and support processes and treatments, which benefits the sector as well as the patients, of course.’
