Classifying Compounds in Public Databases
OntoChem’s SODIAC system automatically classifies chemical compounds in large public databases, such as ChEBI and PubChem, by combining ontology-based classification with SMARTS pattern recognition. It applies hierarchical rules and logical filters to assign each compound to specific compound classes, and because the same rules are applied to millions of compounds, the resulting classification is more consistent and accurate. The system also addresses challenges such as overlapping classifications and inconsistencies across databases. This approach supports efficient data mining and knowledge inference, and it makes it easier to organize chemical entities and to search for compounds across scientific and patent databases.
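The combination of hierarchical rules and logical filters described above can be illustrated with a small sketch. This is not SODIAC's implementation: the class names, feature labels, and rule table below are hypothetical and chemically crude. To keep the example free of dependencies, each compound is represented by a set of precomputed structural features; in a real pipeline those features would presumably come from matching SMARTS patterns with a cheminformatics toolkit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassRule:
    """One node of a toy class hierarchy (all names are illustrative)."""
    name: str
    parent: Optional[str]                # parent class, None for the root
    required: frozenset = frozenset()    # features that must all be present
    excluded: frozenset = frozenset()    # features that must be absent (NOT filter)


# Hypothetical rule table, listed parents-first so that a parent is
# always evaluated before its children.
RULES = [
    ClassRule("organic compound", None, frozenset({"carbon"})),
    ClassRule("carboxylic acid", "organic compound", frozenset({"carboxyl"})),
    ClassRule("amino acid", "carboxylic acid", frozenset({"amine"})),
    ClassRule("amine", "organic compound", frozenset({"amine"})),
    # Crude logical filter: a bare O-H feature also fires on the acid O-H,
    # so anything with a carboxyl group is kept out of "alcohol" here.
    ClassRule("alcohol", "organic compound",
              frozenset({"hydroxyl"}), frozenset({"carboxyl"})),
]


def classify(features):
    """Return all classes whose own rule and whose ancestors' rules match."""
    features = frozenset(features)
    matched = set()
    for rule in RULES:
        # Hierarchical constraint: skip a class if its parent did not match.
        if rule.parent is not None and rule.parent not in matched:
            continue
        if rule.required <= features and not (rule.excluded & features):
            matched.add(rule.name)
    return matched


def most_specific(matched):
    """Keep only matched classes that are not the parent of another match."""
    parents = {rule.parent for rule in RULES if rule.name in matched}
    return sorted(matched - parents)


if __name__ == "__main__":
    compounds = {
        "glycine": {"carbon", "carboxyl", "amine", "hydroxyl"},
        "ethanolamine": {"carbon", "hydroxyl", "amine"},
        "acetic acid": {"carbon", "carboxyl", "hydroxyl"},
    }
    for name, feats in compounds.items():
        print(name, "->", most_specific(classify(feats)))
```

In this sketch a compound can end up in more than one most-specific class (ethanolamine is reported as both an alcohol and an amine), which is one simple way to represent overlapping classifications rather than forcing a single label.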