
Learning from Environmental Data: Methods for Analysis of Forest Nutrition Time Series

Mika Sulkava

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Department of Computer Science and Engineering for public examination and debate in Auditorium T2 at Helsinki University of Technology (Espoo, Finland) on the 18th of January, 2008, at 12 noon.

Overview in PDF format (ISBN 978-951-22-9154-0) [351 KB]
Dissertation is also available in print (ISBN 978-951-22-9153-3)


Data analysis methods play an increasingly important role in improving our knowledge of the environment as the amount of measured environmental data grows. This thesis falls within environmental informatics and environmental statistics, fields in which data analysis methods are developed and applied to environmental data.

The environmental data studied in this thesis are time series of nutrient concentrations measured in pine and spruce needles. In addition, there are data on laboratory quality and on related environmental factors, such as weather and atmospheric deposition.

The most important methods used for the analysis of the data are based on the self-organizing map and linear regression models. First, a new algorithm for clustering the self-organizing map is proposed. It is found to provide better results than two other methods for clustering the self-organizing map. The algorithm is used to divide the nutrient concentration data into clusters, and the result is evaluated by environmental scientists. Based on the clustering, the temporal development of forest nutrition is modeled and the effect of nitrogen and sulfur deposition on the foliar mineral composition is assessed.

Second, regression models are used for studying how much environmental factors and properties of the needles affect the changes in the nutrient concentrations of the needles between their first and second year of existence. The aim is to build understandable models with good prediction capabilities. Sparse regression models are found to outperform more traditional regression models in this task.
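As an illustration of what a sparse regression model does, the following minimal sketch fits a lasso by coordinate descent in plain Python. The lasso is used here only as one well-known example of sparse regression; it is not necessarily the method applied in the thesis, and the function names, the penalty value, and the tiny data set are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + penalty * ||b||_1 (no intercept).

    X is a list of rows, y a list of targets. Hypothetical helper written
    for illustration only.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # residual = y - X @ coef, with coef starting at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm_sq = sum(v * v for v in col) / n
            if norm_sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, penalty) / norm_sq
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                coef[j] = new
    return coef


if __name__ == "__main__":
    # Made-up data: three mutually orthogonal predictors; the response
    # depends strongly on the first and only weakly on the second.
    X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
    y = [2.1, 1.9, -1.9, -2.1]
    print(lasso_coordinate_descent(X, y, penalty=0.5))
```

In this toy example the penalty sets the coefficients of the weak and the irrelevant predictor exactly to zero, which is what makes such models easier to interpret than a full regression.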

Third, laboratory quality data from different sources are fused to estimate the precision of the analytical methods. Weighted regression models are used to quantify how much the precision of observations affects the time needed to detect a trend in environmental time series. The results of power analysis show that improving data quality may shorten the time needed to detect a trend by many years.
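The link between measurement precision and detection time can be sketched with a simple power calculation for the slope of a weighted least-squares line. This is a simplified illustration under stated assumptions, not the models of the thesis: one observation per year, independent errors with known standard deviations, a normal approximation (two-sided 5 % test, 80 % power), and hypothetical function names and numbers.

```python
import math


def weighted_slope_se(times, sigmas):
    """Standard error of the weighted least-squares slope.

    Each observation is weighted by 1 / sigma^2, where sigma is its
    (assumed known) measurement standard deviation.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    w_sum = sum(weights)
    t_bar = sum(w * t for w, t in zip(weights, times)) / w_sum
    sxx = sum(w * (t - t_bar) ** 2 for w, t in zip(weights, times))
    return 1.0 / math.sqrt(sxx)


def years_to_detect(slope, sigma, z_alpha=1.96, z_power=0.84, max_years=200):
    """Smallest number of annual observations needed to detect a linear trend.

    Uses the normal approximation: the trend counts as detectable once
    |slope| / se(slope) >= z_alpha + z_power. Returns None if that is not
    reached within max_years. sigma is held constant here for simplicity.
    """
    for n in range(3, max_years + 1):
        times = list(range(1, n + 1))
        se = weighted_slope_se(times, [sigma] * n)
        if abs(slope) / se >= z_alpha + z_power:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical trend of 0.1 units per year at three precision levels.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, years_to_detect(slope=0.1, sigma=sigma))
```

Under these assumptions, halving the measurement standard deviation from 1.0 to 0.5 reduces the required monitoring period from 22 to 14 annual observations, which illustrates the kind of effect the power analysis quantifies.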

The data analysis methods developed and applied in this thesis are found to produce results that environmental scientists can understand. They are therefore useful for studying the condition of the environment and evaluating possible causes of changes in it.

This thesis consists of an overview and of the following 7 publications:

  1. Juha Vesanto and Mika Sulkava (2002). Distance matrix based clustering of the Self-Organizing Map. In Dorronsoro, J. R., editor, Proceedings of the 12th International Conference on Artificial Neural Networks (ICANN 2002). Madrid, Spain, 27-30 August 2002. Lecture Notes in Computer Science, volume 2415, pages 951-956. © 2002 by authors and © 2002 Springer Science+Business Media. By permission.
  2. Mika Sulkava and Jaakko Hollmén (2003). Finding profiles of forest nutrition by clustering of the Self-Organizing Map. In Proceedings of the 4th Workshop on Self-Organizing Maps (WSOM 2003). Kitakyushu, Japan, 11-14 September 2003, pages 243-248. © 2003 WSOM'03 Organizing Committee. By permission.
  3. Sebastiaan Luyssaert, Mika Sulkava, Hannu Raitio, and Jaakko Hollmén (2004). Evaluation of forest nutrition based on large-scale foliar surveys: are nutrition profiles the way of the future? Journal of Environmental Monitoring, 6 (2): 160-167. © 2004 Royal Society of Chemistry. By permission.
  4. Sebastiaan Luyssaert, Mika Sulkava, Hannu Raitio, and Jaakko Hollmén (2005). Are N and S deposition altering the mineral composition of Norway spruce and Scots pine needles in Finland? Environmental Pollution, 138 (1): 5-17.
  5. Mika Sulkava, Jarkko Tikka, and Jaakko Hollmén (2006). Sparse regression for analyzing the development of foliar nutrient concentrations in coniferous trees. Ecological Modelling, 191 (1): 118-130.
  6. Mika Sulkava, Pasi Rautio, and Jaakko Hollmén (2005). Combining measurement quality into monitoring trends in foliar nutrient concentrations. In Duch, W., Kacprzyk, J., Oja, E., and Zadrożny, S., editors, Artificial Neural Networks: Formal Models and Their Applications, Proceedings of the 15th International Conference on Artificial Neural Networks (ICANN 2005). Warsaw, Poland, 11-15 September 2005. Lecture Notes in Computer Science, Part II, volume 3697, pages 761-767. © 2005 by authors and © 2005 Springer Science+Business Media. By permission.
  7. Mika Sulkava, Sebastiaan Luyssaert, Pasi Rautio, Ivan A. Janssens, and Jaakko Hollmén (2007). Modeling the effects of varying data quality on trend detection in environmental monitoring. Ecological Informatics, 2 (2): 167-176.

Keywords: data analysis, data mining, time series, forest, foliage, nutrient, environmental informatics, environmental statistics, environmental monitoring, clustering, self-organizing map, sparse regression, weighted regression

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2008 Helsinki University of Technology
