| Size | Format |
|---|---|
| 2.16 MB | Adobe PDF |
Authors
Abstract(s)
This report documents a project to visualize mammal species occurrences in Portugal. Knowing where
these species occur and how the land where they are found is used is essential for
protecting them, since they are important for the balance of our ecosystems. To increase biodiversity
literacy in Portugal, a Business Intelligence (BI) system was developed that supports the analysis of
open data from GBIF (Global Biodiversity Information Facility) and land-use data from SNIG
(Sistema Nacional de Informação Geográfica, the National Geographic Information System).
The first objective was therefore to understand concepts not only in the field of IT, especially
dimensional modeling, but also in the field of Biology, in particular the Darwin Core metadata structure used by GBIF. To achieve this objective, scripts were developed to extract
data from GBIF and perform statistical analysis on it. Geographical land-use data was also
analyzed and visualized, and it was confirmed that this data can be integrated with the GBIF
data.
The second objective of the project was to develop an easy-to-use dimensional model that supports a variety of analyses of mammal occurrences in Portugal, based on dates, taxonomy,
geography and registration. For this, an ETL (Extract, Transform and Load) system was built that
extracts data from GBIF and SNIG into a data staging area, where transformations
organize and integrate the data and make it more intelligible. The data is then loaded into a
data presentation area, accessible to users.
The third objective was to demonstrate what the dimensional model makes possible when combined with
BI technologies, through interactive visualizations such as a map that shows
the location of mammals or a bar chart that shows the number of species categorized
by threat level (vulnerable, endangered, among others). To evaluate the work developed,
the data quality and the execution time of each script were assessed. Part of the evaluation of the visual analysis was carried out by an expert, who gave a favorable opinion of the
visualizations.
Description
Master's thesis, Informatics Engineering (Software Engineering), 2024, Universidade de Lisboa, Faculdade de Ciências
Keywords
Dimensional modeling; ETL; Information visualization; GBIF; Land use; Master's theses - 2024
