Learning, Transferring and Adapting Distance Measures for Quantitative Structure-Activity Relationships
Tuesday 7 July 2009 at 12.30 PM by Webmaster
AAPN seminar (Stefan Kramer)
Location: LIPN, B311, LIPN — Duration: one hour
Quantitative structure-activity relationships (QSARs) are regression models that relate chemical structure to biological activity. They make it possible to predict toxicologically or pharmacologically relevant endpoints, that is, the target outcomes of trials or experiments. The task is often tackled by instance-based methods (such as k-nearest neighbors), which all rely on a notion of chemical (dis-)similarity. Clearly, it would be desirable to determine a suitable distance measure for a given QSAR dataset a priori. Our starting point is the observation by Raymond and Willett that the two major families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical (dis-)similarity. We define a simple new distance measure that weights representatives of the two families, propose an optimization scheme for learning optimal weights for the combined measures, and investigate the transfer and adaptation of those weights from one problem to another, related problem with a similar or identical endpoint. Our experiments suggest that learning distance measures for QSAR (here formally defined as regression on molecular graphs) is feasible, and that the success of transferring and adapting such distance measures depends, among other factors, on training set size.
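
To make the idea of a weighted distance measure concrete, here is a minimal sketch in Python. It is not the method presented in the talk: the convex combination with a single weight `w`, the leave-one-out grid search, and all function and variable names are assumptions chosen for illustration. The distance matrices are toy values standing in for precomputed fingerprint-based and maximum common subgraph-based distances.

```python
# Illustrative sketch only. The convex combination, the grid search and every
# name below are assumptions, not the scheme described in the abstract.

def combined_distance(d_fp, d_mcs, w):
    """Mix a fingerprint-based and an MCS-based distance with weight 0 <= w <= 1."""
    return w * d_fp + (1.0 - w) * d_mcs


def knn_predict(query, train_idx, D_fp, D_mcs, y, w, k):
    """Predict the activity of `query` as the mean over its k nearest training items."""
    ranked = sorted(
        train_idx,
        key=lambda j: combined_distance(D_fp[query][j], D_mcs[query][j], w),
    )
    nearest = ranked[:k]
    return sum(y[j] for j in nearest) / len(nearest)


def loo_error(D_fp, D_mcs, y, w, k):
    """Leave-one-out mean squared error of k-NN regression under weight w."""
    n = len(y)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        pred = knn_predict(i, others, D_fp, D_mcs, y, w, k)
        total += (pred - y[i]) ** 2
    return total / n


def learn_weight(D_fp, D_mcs, y, k=1, steps=10):
    """Grid search over w in [0, 1]; return a weight with the lowest LOO error."""
    grid = [s / steps for s in range(steps + 1)]
    return min(grid, key=lambda w: loo_error(D_fp, D_mcs, y, w, k))


# Toy data (made up): four molecules with activities y. The fingerprint
# distances group molecules with equal activity; the MCS distances do not.
y = [0.0, 0.0, 10.0, 10.0]
D_fp = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
D_mcs = [
    [0.0, 0.9, 0.1, 0.9],
    [0.9, 0.0, 0.9, 0.1],
    [0.1, 0.9, 0.0, 0.9],
    [0.9, 0.1, 0.9, 0.0],
]

w_best = learn_weight(D_fp, D_mcs, y, k=1)
print("learned weight on the fingerprint distance:", w_best)
```

On this toy data the search favors the fingerprint distance, because only that measure places molecules with the same activity close together. In the same spirit, transferring a distance measure could mean reusing a weight learned on one dataset for a related one, and adapting it could mean re-running the search on the new data in a neighborhood of that weight; both readings are assumptions about how the abstract's terms might be realized.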