I am using JChem for Excel to predict pKa and logD for fluorinated compounds. Many QSAR models have very few of these compounds in their training data, so the chemicals are often outside the applicability domain. Is there any built-in domain assessment for the pKa (acidic and basic) and logD models? Alternatively, could I find out how many fluorinated compounds (and their extent of fluorination) are in the training datasets for these models? I can't find anything about the domain in the documentation.
There are none. We don't have fluorinated compounds in our training set, mainly because the core domain of our pKa and logP/D predictors (small and medium-sized drug molecules) typically doesn't contain such molecules. Fluorinated compounds are usually not measured for pKa or logP/D.
In general, our pKa predictor uses an ab initio model and doesn't have a training set. Our logP model, however, does use training sets. You can read more about them here: