Simon Endele

Research Student in the PhD program from 01.06.2007 to 30.08.2008.


Prof. Dr. Ulrik Brandes

Centering Resonance Analysis (CRA), an approach developed by Corman et al., describes how to generate networks from natural-language texts. Centrality measures have been applied to these networks to identify the most important words in a text, with the aim of summarizing it.
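To make the idea concrete, the following is a minimal sketch of building a word network from a text and ranking its words by betweenness centrality. It is not the CRA implementation: the noun phrases are assumed to be extracted already, the linking rule (all words within a phrase, plus consecutive phrases of a sentence) is a simplification, and the names `build_network`, `betweenness` and the example phrases are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_network(sentences):
    """Build an undirected word network.

    sentences: list of sentences, each a list of non-empty noun phrases,
    each phrase a list of words (assumed to be extracted beforehand).
    """
    adj = {}

    def link(u, v):
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    for phrases in sentences:
        previous_last = None
        for phrase in phrases:
            for word in phrase:
                adj.setdefault(word, set())
            # Link all words that occur in the same noun phrase.
            for u, v in combinations(phrase, 2):
                link(u, v)
            # Link consecutive noun phrases of the same sentence.
            if previous_last is not None:
                link(previous_last, phrase[0])
            previous_last = phrase[-1]
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected graph."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair of endpoints was counted twice.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical input: two sentences with pre-extracted noun phrases.
sentences = [
    [["centering", "resonance", "analysis"], ["network"]],
    [["network"], ["important", "words"]],
]
network = build_network(sentences)
scores = betweenness(network)
ranking = sorted(scores, key=lambda w: (-scores[w], w))
print(ranking)
```

Words that connect otherwise separate parts of the network, here "analysis" and "network", receive the highest scores in this toy example.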

We investigated whether these networks can serve a further purpose: is it possible to recover the individual topics discussed in the underlying text from its CRA network?

To this end, we looked for clustering methods suited to CRA networks. Two approaches in particular appeared to be useful: edge betweenness clustering and Newman's greedy modularity clustering.
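As an illustration of the second approach, here is a naive sketch of greedy modularity clustering: every word starts in its own cluster, and the pair of clusters whose merge increases modularity the most is merged until no merge yields a gain. Efficient implementations update the gains incrementally rather than recomputing them in every round as done here. The function name `greedy_modularity` and the toy word network are hypothetical, and the input is assumed to be a simple undirected graph without duplicate edges or self-loops.

```python
def greedy_modularity(edges):
    """Greedily merge clusters while modularity increases.

    edges: list of (u, v) pairs of an undirected simple graph.
    Returns the clusters as a sorted list of sorted node lists.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Each node starts in its own cluster, labelled by the node itself.
    members = {v: {v} for v in degree}
    cluster_degree = dict(degree)
    while True:
        # Count the edges between each pair of distinct clusters.
        label = {v: c for c, nodes in members.items() for v in nodes}
        between = {}
        for u, v in edges:
            a, b = label[u], label[v]
            if a != b:
                key = (a, b) if a < b else (b, a)
                between[key] = between.get(key, 0) + 1
        # Modularity gain of merging clusters a and b:
        # e_ab / m - d_a * d_b / (2 * m^2)
        best_gain, best_pair = 0.0, None
        for (a, b), e_ab in sorted(between.items()):
            gain = e_ab / m - cluster_degree[a] * cluster_degree[b] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return sorted(sorted(nodes) for nodes in members.values())
        a, b = best_pair
        members[a] |= members.pop(b)
        cluster_degree[a] += cluster_degree.pop(b)


# Hypothetical word network: two densely linked topics joined by one edge.
edges = [
    ("graph", "node"), ("graph", "edge"), ("node", "edge"),
    ("text", "word"), ("text", "sentence"), ("word", "sentence"),
    ("edge", "text"),
]
clusters = greedy_modularity(edges)
print(clusters)
```

On this toy network the procedure stops with the two triangles as clusters, because merging them across the single connecting edge would lower modularity.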

We created artificial reference texts and visualized them with LiteratureVis to obtain visual feedback that can be evaluated. It should be noted that the source texts have to be well suited in terms of size and heterogeneity of content.
Networks of very long texts appear to be too dense to be split up appropriately, whereas very short texts result in networks that are already fragmented. Moreover, it is very hard to find appropriate test examples and to evaluate the approach at various levels of difficulty.


