Saturday, August 6, 2011

Qt dependency removed from libchemkit

I just pushed the final few commits needed to remove the Qt depencency from libchemkit (the core chemkit library). You can see the final commit in the series here.

The proof can be seen in ldd's output below:

$ ldd libchemkit.so
    linux-vdso.so.1 =>  (0x00007fffcf9ff000)
    libboost_system.so.1.42.0 => /usr/lib/libboost_system.so.1.42.0 (0x00007fccae6b9000)
    libboost_filesystem.so.1.42.0 => /usr/lib/libboost_filesystem.so.1.42.0 (0x00007fccae4a4000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fccae19d000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fccadf18000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fccadd02000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fccad96d000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fccad765000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fccad547000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fccaeb8a000)


Now libchemkit only depends on the C++ standard library, Boost, and Eigen3.

The next step is moving the remaining file format plugins that use QIODevice over to using STL's istream and ostream. You can see which plugin require this refactoring over in the chemkit bug tracker.

Tuesday, April 26, 2011

Chemkit relicensed under the BSD license

As of today the chemkit library is licensed under the 3-clause BSD license. The new license file can be seen here.

The BSD license is far simpler and far less restrictive than the previously used GNU Lesser General Public License (LGPL). This new license should allow for a much wider usage of the chemkit library.

Monday, April 4, 2011

Chemkit moves to CMake

Over this past weekend I ported the build system for chemkit from QMake to CMake. You can see the commit here (it's quite big). CMake is a great system and it is much more powerful and configurable than QMake was.

The impact for users is minimal and the only real change is that now to compile chemkit from source you need to use the following commands:

  cd /path/to/chemkit
  mkdir build
  cd build
  cmake ..
  make

CMake uses out of source builds which eliminates the clutter of object files in the source directories and moves them all under the build/ directory.

Another nice feature of CMake is its integration with CTest. This allows for all the chemkit auto-tests to be run with just a single command like so:

  cd /path/to/chemkit/build
  ctest

CTest replaces the old, custom autotest.py script and works quite well.

Overall I am pleased with CMake's performance. I haven't tested it on Windows yet but hopefully, due to its cross-platform nature, it will "Just Work". Let me know if you have any issues building chemkit with the new build system.

Wednesday, March 16, 2011

Chemkit's VF2 Algorithm in Substructure Benchmark

Asad Rahman posted a benchmark on his blog comparing chemkit's VF2 substructure isomorphism algorithm (which he ported to Java) with two others. As seen from the graph chemkit fares very well against the competition :-).

Tuesday, March 15, 2011

Molecular surface support in chemkit

Today I added support for visualizing molecular surfaces to chemkit. You can see the commit here. The new class is called GraphicsMolecularSurfaceItem and it was added to the chemkit-graphics library.

The rendering is fast due to the underlying O(n log n) alpha-shape based surface algorithm. There is still room for improvement but the new code works well.

Here is a screenshot of the new molecular surface item on top of a guanine molecule:

So far only Van Der Waals and solvent accessible surfaces are supported. Solvent excluded surfaces are coming soon. Stay tuned.

Wednesday, February 23, 2011

Optimizing the UFF force field by unrolling the pow() function

Today I added a new performance benchmark to the chemkit library. The new benchmark, called uridine-minimization, measures the time to perform an energy minimization until convergence on a uridine molecule using the UFF force field.

Running the benchmark gives:
2,985 msecs per iteration
After running the benchmark through a profiler (I used callgrind from the valgrind package) I found some pretty interesting results and a potential easy performance optimization.

Total time spent in the program

Time spent in functions called by UffVanDerWaalsCalculation::energy()
We see that 53% of the total time the program is in the UffVanDerWaalsCalculation::energy() method and, of that, 85% of the time is spent executing the pow()function. A potential optimization is to remove the pow()function call and replace it with a manual calculation of the two exponent terms.

The current code is:
chemkit::Float UffVanDerWaalsCalculation::energy() const
{
    const chemkit::ForceFieldAtom *a = atom(0);
    const chemkit::ForceFieldAtom *b = atom(1);

    chemkit::Float d = parameter(0);
    chemkit::Float x = parameter(1);
    chemkit::Float r = distance(a, b);

    return d * (-2 * pow(x/r, 6) + pow(x/r, 12));
}

And the new code becomes:
chemkit::Float UffVanDerWaalsCalculation::energy() const
{
    const chemkit::ForceFieldAtom *a = atom(0);
    const chemkit::ForceFieldAtom *b = atom(1);

    chemkit::Float d = parameter(0);
    chemkit::Float x = parameter(1);
    chemkit::Float r = distance(a, b);

    chemkit::Float xr = x / r;
    chemkit::Float xr2 = xr * xr;
    chemkit::Float xr4 = xr2 * xr2;
    chemkit::Float xr6 = xr4 * xr2;
    chemkit::Float xr12 = xr6 * xr6;

    return d * (-2 * xr6 + xr12);
}
After making this change the benchmark time fell dramatically:
1,803 msecs per iteration
The new code is 60% faster. A good improvement. But, now we had to make sure that the new method is just as accurate as the old method. To test the difference in the normal and unrolled functions I wrote the following code which calculates the relative error introduced and the speed of each version:
#include <cmath>
#include <ctime>
#include <limits>
#include <iostream>

// calculates the van der waals energy for the given distance 'r'
double energy(double r)
{
    // parameters for Helium
    double d = 0.056;
    double x = 2.362;
    
    // [Rappe 1992] equation 20
    return d * (-2 * pow(x/r, 6) + pow(x/r, 12));
}

// calculates the van der waals energy for the given distance 'r'
//
// unrolls the pow() function for increased performance
double energy_unrolled(double r)
{
    // parameters for Helium
    double d = 0.056;
    double x = 2.362;

    double xr = x / r;
    double xr2 = xr * xr;
    double xr4 = xr2 * xr2;
    double xr6 = xr4 * xr2;
    double xr12 = xr6 * xr6;
    
    // [Rappe 1992] equation 20
    return d * (-2 * xr6 + xr12);
}

// returns the time (in msecs) per iteration of 'function' called 
// with r from 0 to 100 in picometer steps
double benchmark(double (*function)(double), int iterations)
{
    time_t start = time(0);

    for(int i = 0; i < iterations; i++){
        for(double r = 1e-2; r < 100; r += 1e-2){
            volatile double e = function(r);
        }
    }

    return difftime(time(0), start) / iterations * 1000;
}

// measures the speed of the energy() and energy_unrolled() functions
void measure_speed()
{
    std::cout << "energy time: " << benchmark(energy, 1e5) << " msec" << std::endl;
    std::cout << "energy_unrolled time: " << benchmark(energy_unrolled, 1e5) << " msec" << std::endl;
}

// measures the difference of the results of the energy() and
// energy_unrolled() functions
void measure_difference()
{
    int iterations = 0;
    double relative_error_sum = 0;
    double relative_error_max = 0;

    for(double r = 1e-2; r < 100; r += 1e-2){
        double e = energy(r);
        double e0 = energy_unrolled(r);

        double difference = std::abs(e - e0);
        double relative_error = std::abs(difference / e);

        relative_error_sum += relative_error;

        if(relative_error > relative_error_max){
            relative_error_max = relative_error;
        }

        iterations++;
    }

    double relative_error_mean = relative_error_sum / iterations;

    std::cout << "max relative error: " << relative_error_max << std::endl;
    std::cout << "mean relative error: " << relative_error_mean << std::endl;
}

int main(void)
{
    measure_speed();
    measure_difference();

    return 0;
}
The results are:
$ g++ energy.cpp -o energy -O2
$ ./energy
energy time: 2.28 msec
energy_unrolled time: 0.16 msec
max relative error: 5.6035e-15
mean relative error: 1.28544e-16
We see that the unrolled version is significantly faster but introduces a mean relative error of ~1.3e-16.

Instead of manually optimizing the code, it would be great if we could get the compiler to optimize it for us. Looking through the g++ documentation turned up the -ffast-math compiler flag. It claims to increase the performance of floating-point math at the expense of breaking IEEE floating-point compliance and potentially reducing accuracy.

Compiling and running the test with the -ffast-math flag gives:
$ g++ energy.cpp -o energy -O2 -ffast-math
$ ./energy
energy time: 0.13 msec
energy_unrolled time: 0.14 msec
max relative error: 1.81079e-14
mean relative error: 9.84612e-17
So we can see that using the -ffast-math flag enables the same optimization that we performed manually.

Going back to the UFF force field I compiled the UFF plugin with the -ffast-math flag and re-ran the benchmark. It gave:
1,590 msecs per iteration
Using the -ffast-math option increased the speed by 88%, an even bigger improvement than just manually unrolling the pow() function.

For now I will leave the unoptimized energy function and let the user choose if they want to trade speed for accuracy by passing the -ffast-math compiler flag when compiling.

Friday, February 18, 2011

Counting heteroatoms in Python: try/except vs. if

I found this answer on the Blue Obelisk eXchange interesting and, from the comments, seemingly controversial. The answer states that in Python using try/except with an assert could be faster than just an if statement. Intuitively I felt that try/except would be much slower than a simple if statement but I wanted to know for sure. So I opened my editor, wrote some code, and found out which actually performed better.

I used the chemkit library's Atom::isHeteroatom() method to perform the counting of the heteroatoms. For the timing I used Python's timeit module.

The code reads the 753 molecules from the MMFF validation suite's test file MMFF94_hypervalent.mol2 and then uses the timeit.Timer class to measure the execution time of counting the total number of heteroatoms.

Here is the code:
import timeit
import chemkit

# Counts the number of heteroatoms in the molecule
# using an if statement
def heteroatom_count_if(molecule):
    count = 0

    for atom in molecule.atoms():
        if atom.isHeteroatom():
            count += 1

    return count

# Counts the number of heteroatoms in the molecule
# using a try/except statement
def heteroatom_count_try(molecule):
    count = 0

    for atom in molecule.atoms():
        try:
            assert(atom.isHeteroatom())
            count += 1
        except:
            continue

    return count

# Counts the total number of heteroatoms in the list
# of molecules using the given heteroatom_function
def count(molecules, heteroatom_function):
    count = 0

    for molecule in file.molecules():
        count += heteroatom_function(molecule)

    return count

if __name__ == '__main__':
    # read molecules from file
    file = chemkit.ChemicalFile("MMFF94_hypervalent.mol2")
    if not file.read():
        print 'Error reading file: ' + file.errorString()
        exit()

    # list of molecules
    molecules = file.molecules()

    # measure heteroatom_count_if
    t = timeit.Timer("count(molecules, heteroatom_count_if)", 
                     "from __main__ import count, molecules, heteroatom_count_if")

    print 'heteroatom_count_if time: ' + str(t.timeit(500))

    # measure heteroatom_count_try
    t = timeit.Timer("count(molecules, heteroatom_count_try)", 
                     "from __main__ import count, molecules, heteroatom_count_try")

    print 'heteroatom_count_try time: ' + str(t.timeit(500))
And here are the results:
$ python heteroatoms.py
heteroatom_count_if time: 8.65832996368

heteroatom_count_try time: 17.5534369946
So we see that the if statement version is roughly twice as fast as the try/except version. That was my intuition, but it's good to be sure.