Understanding linguistic Markush expressions in chemical patents
This paper describes OntoChem’s advances in processing Markush claims in chemical patents and the challenges of extracting and interpreting complex chemical expressions. The OC|processor system combines dictionary-based entity recognition, rule combiners, and syntax pattern matching to identify chemical compounds and their properties in patent texts. Special emphasis is placed on decoding Markush-style expressions for chemical compositions, alloys, and polymers, on capturing conditional data, and on handling inconsistencies across patent formats. The approach makes chemical data more accessible for patent analysis and supports precise retrieval of the Markush structures that matter for R&D in chemistry and materials science.
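To make the idea of a linguistic Markush expression concrete, the following minimal sketch pairs a toy dictionary lookup with one syntax pattern. It is not the OC|processor implementation: the dictionary contents, the function name `extract_markush_members`, the example claim, and the single phrasing it recognizes ("selected from the group consisting of ...") are all assumptions chosen for illustration.

```python
import re

# Hypothetical mini-dictionary (assumption); a production system would rely
# on large curated chemical lexica rather than a handful of terms.
CHEMICAL_DICTIONARY = {"copper", "zinc", "nickel", "tin", "aluminium"}

# One common linguistic Markush phrasing (assumption: many others exist).
# The enumeration is taken to run until the next period or semicolon.
MARKUSH_PATTERN = re.compile(
    r"selected from the group consisting of\s+(?P<members>[^.;]+)",
    re.IGNORECASE,
)

# Separators inside the enumeration: commas (optionally followed by
# "and"/"or") or a bare "and"/"or" between two members.
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_markush_members(text):
    """Return, per Markush-style group found, the dictionary-matched members."""
    groups = []
    for match in MARKUSH_PATTERN.finditer(text):
        parts = SEPARATOR.split(match.group("members"))
        members = [part.strip().lower() for part in parts if part.strip()]
        # Dictionary step: keep only terms the lexicon recognizes.
        groups.append([m for m in members if m in CHEMICAL_DICTIONARY])
    return groups


# Invented example claim text, not taken from any patent.
claim = (
    "An alloy comprising a metal selected from the group consisting of "
    "copper, zinc, and nickel; and a binder."
)
print(extract_markush_members(claim))  # [['copper', 'zinc', 'nickel']]
```

In this sketch the pattern finds the enumeration and the dictionary filters it, so an unrecognized term is silently dropped. A real pipeline would more likely flag such terms for review and would need many more patterns to cover the variation in patent wording.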