OCMiner for Patents: Extracting Chemical Information from Patent Texts
OCMiner, a semantic text processing system developed by OntoChem, identifies and extracts chemical entities from complex patent texts. It combines high-performance dictionary lookup, name-to-structure conversion, and a specialized chemical ontology to recognize a range of chemical entities, including formulas, compound names, and abbreviations, in multiple languages. OCMiner participated in the BioCreative V CHEMDNER-Patents challenge, where it achieved notable results in both precision and recall for chemical named entity recognition. Its ontology-based approach supports concept mapping: recognized text spans are linked to ontology concepts, turning them into structured chemical knowledge that can be indexed and analyzed.
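The dictionary lookup and concept mapping steps described above can be illustrated with a minimal Python sketch. This is not OCMiner's implementation, which is not shown in this text. The dictionary entries, the `CONCEPT:` identifiers, and the names `Annotation`, `build_pattern`, and `annotate` are all hypothetical, and a regular-expression alternation stands in for a high-performance lookup structure.

```python
import re
from typing import NamedTuple

# Toy dictionary: lowercased surface form -> concept identifier.
# The identifiers are illustrative placeholders, not real ontology IDs.
DICTIONARY = {
    "acetylsalicylic acid": "CONCEPT:0001",
    "aspirin": "CONCEPT:0001",
    "asa": "CONCEPT:0001",
    "sodium chloride": "CONCEPT:0002",
    "nacl": "CONCEPT:0002",
}


class Annotation(NamedTuple):
    start: int        # character offset where the mention begins
    end: int          # character offset just past the mention
    text: str         # mention exactly as it appears in the input
    concept_id: str   # concept the mention is mapped to


def build_pattern(terms):
    """Compile one case-insensitive pattern matching any dictionary term."""
    # Longer terms come first so the alternation prefers the longest match.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # The lookarounds reject matches embedded in a longer alphanumeric token.
    return re.compile(
        r"(?<![A-Za-z0-9])(?:" + alternation + r")(?![A-Za-z0-9])",
        re.IGNORECASE,
    )


PATTERN = build_pattern(DICTIONARY)


def annotate(text):
    """Return every dictionary mention in text with its mapped concept."""
    return [
        Annotation(m.start(), m.end(), m.group(0), DICTIONARY[m.group(0).lower()])
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    for annotation in annotate("The tablet contains aspirin (ASA) and NaCl."):
        print(annotation)
```

In this sketch a trivial name, an abbreviation, and a formula all resolve to shared concept identifiers, which is the property that lets differently worded mentions be indexed together. A production system would also need name-to-structure conversion for systematic names that no dictionary lists.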