Posts by admin_ontochem
OntoChem extracts U.S. Food and Drug Administration SPL files
The “Structured Product Labeling” (SPL) files of the United States FDA are a valuable public resource on marketed drugs. The FDA registration system assigns a UNII (Unique Ingredient Identifier) number to each drug and its chemical structure information. UNII numbers are also used in several databases such as drug labels in…
Read More
Processing tables in documents and images
Probably most scientific information is captured in tables. For example, from US patents from 2001-2017 we have extracted more than 10 million tables containing interesting properties of materials and compounds. Over the last 5 years, OntoChem has developed several technologies to extract this knowledge. These software modules may read different…
Read More
Semantic homonym resolution – key to reducing the number of false positive search hits
Many words, known as “homonyms”, can have several different meanings. Homonymic terms are often the cause of false positive search hits. How do we use semantic indexing to find what you intended? Homonyms in different knowledge areas: Just take the term “sting” – it could mean a protein named Sting (stimulator of interferon…
Read More
Processing images to chemical structures
A lot of scientific information is captured in images, and we use machine learning techniques such as deep neural networks to classify them. For example, we have applied transfer learning to train a deep convolutional neural network as an ML classifier that detects whether an image contains a chemical structure. If so, this…
Read More