Case Study - Named Entity Recognition for Medical Texts
In this project, two of our employees improved the performance of Named Entity Recognition (NER) on Swedish medical texts.
- Client
- Sahlgrenska Universitetssjukhuset
- Year
- Service
- Machine Learning
Overview
In the digital age, data has become an important asset, yet in the Swedish healthcare system an enormous amount of valuable information remains locked in free text in patient records. Turning this unstructured information into usable data is a major challenge.
That is where this project comes in. It was carried out by two of our current employees, with the goal of efficiently extracting valuable data from Swedish patient records using Named Entity Recognition (NER).
The work investigated several different BERT models and data augmentation techniques that have the potential to significantly improve NER results on Swedish patient records.
The results showed that data augmentation could significantly improve the system's performance, especially on smaller data sets. Notably, training on 50% of the data with augmentation achieved results comparable to training on the entire original data set without augmentation.
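To give a feel for what data augmentation for NER can look like, here is a minimal sketch of one common technique, mention replacement, where an entity in a sentence is swapped for another entity of the same type taken from the training data. The case study does not state which augmentation techniques were used, so this is an illustration only. The function names, the labels DRUG and SYMPTOM, and the toy sentences are all assumptions made up for the example, not project data.

```python
import random
from collections import defaultdict


def extract_mentions(sentences):
    """Collect entity mentions per label from BIO-tagged sentences."""
    mentions = defaultdict(list)
    for tokens, tags in sentences:
        i = 0
        while i < len(tokens):
            if tags[i].startswith("B-"):
                label = tags[i][2:]
                j = i + 1
                # Extend the span over the following I- tags of the same label.
                while j < len(tokens) and tags[j] == f"I-{label}":
                    j += 1
                mentions[label].append(tokens[i:j])
                i = j
            else:
                i += 1
    return mentions


def mention_replacement(tokens, tags, mentions, rng, p=0.5):
    """With probability p, swap each entity for another mention of the same label."""
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            label = tags[i][2:]
            j = i + 1
            while j < len(tokens) and tags[j] == f"I-{label}":
                j += 1
            span = tokens[i:j]
            if rng.random() < p:
                span = rng.choice(mentions[label])
            new_tokens.extend(span)
            # Re-tag the (possibly longer or shorter) replacement span.
            new_tags.extend([f"B-{label}"] + [f"I-{label}"] * (len(span) - 1))
            i = j
        else:
            new_tokens.append(tokens[i])
            new_tags.append(tags[i])
            i += 1
    return new_tokens, new_tags


# Toy, made-up training sentences (hypothetical labels, not real patient data).
train = [
    (["Patienten", "fick", "Alvedon", "mot", "svår", "huvudvärk"],
     ["O", "O", "B-DRUG", "O", "B-SYMPTOM", "I-SYMPTOM"]),
    (["Insatt", "på", "acetylsalicylsyra", "efter", "yrsel"],
     ["O", "O", "B-DRUG", "O", "B-SYMPTOM"]),
]

mentions = extract_mentions(train)
rng = random.Random(0)  # fixed seed so the augmentation is reproducible
augmented = [mention_replacement(t, g, mentions, rng) for t, g in train]
```

The augmented sentences would be added to the original training set before fine-tuning the model. Because replacements keep the entity label and only change the surface form, the tags stay aligned with the tokens even when the new mention has a different length.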
This project shows how the right technology and methods can help extract valuable information from Swedish patient records, which can contribute to better-informed, data-driven healthcare.
What we did
- Machine Learning
- Data Augmentation
- PyTorch
- Python