I saw this blog post by Egon Willighagen a while back and bookmarked it to read later. He discusses a relatively new article by Zhao et al. describing a molecular descriptor, named VABC, that predicts the van der Waals volume of a molecule from just its atoms and bonds (e.g. from just its SMILES string). The article is titled "Fast Calculation of van der Waals Volume as a Sum of Atomic and Bond Contributions and Its Application to Drug Compounds" and can be found here.
Last week, while sitting on an airplane back from vacation, I hacked up an implementation of the descriptor for chemkit. You can see the commit on GitHub here. The implementation was fairly straightforward and I added a test suite using the data from the spreadsheet in the paper's supplementary information.
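To give a feel for why the descriptor is simple to implement, here is a minimal standalone sketch of the VABC formula as I recall it from the paper: the sum of the atomic volumes, minus 5.92 per bond, minus 14.7 per aromatic ring, minus 3.8 per non-aromatic ring. This is not the chemkit code. The function vabcVolume() and its parameters are hypothetical names made up for illustration, the constants are quoted from memory and should be checked against the paper, and the sketch assumes a single connected molecule with hydrogens counted explicitly.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of chemkit): estimates the van der Waals
// volume in cubic angstroms from element counts and ring counts.
// Assumes one connected molecule with explicit hydrogen counts.
double vabcVolume(const std::map<std::string, int> &atomCounts,
                  int aromaticRings,
                  int nonAromaticRings)
{
    // Atomic volume contributions in cubic angstroms (quoted from memory
    // of the paper's table; verify before relying on them).
    static const std::map<std::string, double> atomVolume = {
        {"H", 7.24},  {"C", 20.58},  {"N", 15.60},  {"O", 14.71},
        {"F", 13.31}, {"Cl", 22.45}, {"Br", 26.52}, {"I", 32.52},
        {"P", 24.43}, {"S", 24.43}
    };

    assert(aromaticRings >= 0 && nonAromaticRings >= 0);

    double atomSum = 0.0;
    int atomCount = 0;
    for (const auto &entry : atomCounts) {
        const auto it = atomVolume.find(entry.first);
        if (it == atomVolume.end()) {
            throw std::invalid_argument("no volume for element " + entry.first);
        }
        atomSum += it->second * entry.second;
        atomCount += entry.second;
    }

    if (atomCount == 0) {
        return 0.0;
    }

    // Bond count for a connected molecule: N - 1 + (number of rings).
    const int bondCount = atomCount - 1 + aromaticRings + nonAromaticRings;

    return atomSum
         - 5.92 * bondCount
         - 14.7 * aromaticRings
         - 3.8 * nonAromaticRings;
}
```

For example, vabcVolume({{"C", 6}, {"H", 6}}, 1, 0) describes benzene (12 atoms, one aromatic ring, hence 12 bonds) and works out to roughly 81.2 cubic angstroms. The real descriptor gets the atom, bond and ring counts from the molecule itself rather than taking them as arguments.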
The simplest way to use it is via the Molecule::descriptor() method like so:
float value = molecule.descriptor("vabc").toFloat();
Chemkit is now in an interesting position: it implements both an analytical van der Waals volume calculator (which uses the molecule's 3D coordinates) and this new predictive volume descriptor. When I find some time, I will write a post comparing the relative speed and accuracy of the two methods. Stay tuned!
Edit: I've posted a follow-up with an analysis of the accuracy and performance of the VABC descriptor here: http://kylelutz.blogspot.com/2012/01/vabc-molecule-descriptor-added-to.html