Publications
| Author(s) |
Di Fatta, G., Berthold, M. |
| Title |
Distributed mining of molecular fragments |
| Abstract |
In real-world applications, sequential algorithms for
data mining and data exploration are often unsuitable for
datasets of enormous size, high dimensionality, and complex
data structure. Grid computing promises unprecedented
opportunities for unlimited computing and storage
resources. In this context, there is a need to develop
high performance distributed data mining algorithms.
However, the computational complexity of the problem and
the large amount of data to be explored often make the design
of large scale applications particularly challenging. In
this paper we present the first distributed formulation of
a frequent subgraph mining algorithm for discriminative
fragments of molecular compounds. Two distributed approaches
have been developed and compared on the well-known
National Cancer Institute's HIV-screening dataset.
We present experimental results on a small-scale computing
environment. |
| Download |
BeDi04.pdf |