Extracting Relevant Chemical Information from Patents with Machine Learning

Posted on October 5th, 2020 by Xuanyan Xu in Chemistry

New chemical compounds and reactions are often introduced to the world through patents, and with little fanfare. Years may pass after a patent is filed before these compounds appear in scholarly journals, and only a small share of them are ever published at all. As a result, these compounds can easily remain unknown to researchers who may be very interested in them.

Text mining is one potentially useful way of bringing this important chemical information to light, but unfortunately most text mining approaches don’t take into account how relevant a compound is to the patent in which it appears. As a result, too much irrelevant data is extracted, which slows down and complicates the search process.

However, advanced technologies like machine learning (ML) and natural language processing (NLP) have enabled the development of models that can overcome this problem and ensure the extraction of only the relevant compounds, thus making patent resources much more helpful to researchers.

Saber Akhondi, a principal NLP scientist at Elsevier, will be diving into this topic in a webinar on October 7 titled “Using machine learning to extract chemical information from patents.” He will discuss chemical information extraction, the unique challenges of patent mining in the chemical domain, and how to create a quality training set for machine learning in chemistry.

If you’d like to learn more and attend this webinar, register here.

