
Perspectives on the re-use of data quality metadata


Tony Mathys (tony.mathys@ed.ac.uk) and Jo Walsh (jo.walsh@ed.ac.uk)

EDINA National Datacentre,

The University of Edinburgh

Table of Contents

Summary
1 Introduction
1 ESDIN requirements and objectives regarding metadata
2 Challenges for data quality metadata
2.1 Perception challenge
2.2 Policy challenge
3 Recommendations
3.1 Automation tools and automating metadata validation and quality evaluation
3.1.1 Automation tools
3.1.2 Automated validation
3.1.3 Existing Validators
3.2 Simplification
3.3 Quality assurance for metadata and data quality descriptions
3.4 Consider Metaquality
3.5 Establishment of a Centralised Support Organisation
4 Conclusions
5 References
6 Appendix A – list of data themes in INSPIRE Annexes I, II and III
6.1 Annex I themes
6.2 Annex II themes
6.3 Annex III themes

Summary
We provide some background to the discussion of quality of geographic information – why data
quality measurement and metadata are of such significance.

We reflect on the experiences within the communities preparing for the INSPIRE Directive, and
within the ESDIN best practice network which is helping National Mapping and Cadastral Agencies
prepare for INSPIRE.

We outline some of the challenges, of both policy and implementation, that may be hindering the
creation and sharing of data quality metadata. We identify several key areas in which research may
be progressing and where some investment now may reduce future cost burdens for geodata quality
- in automation, automated testing and quality assurance, sharing processes that have led to data
simplification, and consistent metaquality.

We go on to make a recommendation for a centralised agency to curate, assure and audit metadata
and data quality metadata made available under the terms of the INSPIRE Directive.

1 Introduction

The importance of data quality metadata for INSPIRE or any collaborative spatial data project
cannot be overstated. The remit of the INSPIRE Directive1 is to “establish an infrastructure for
spatial information in Europe to support Community environmental policies, and policies or
activities which may have an impact on the environment.”

The environment has a direct impact on the daily lives of millions of EU citizens. Data modelling
becomes an integral part of planning for the prevention of, and preparation for, natural disasters
which can pose serious risks to people, property and livelihoods. Experts building these models must
be confident in the authenticity of the data within the models, especially when situation awareness
becomes critical and time is of the essence to make decisions which can affect outcome. Inter-
dependencies between environment, infrastructure and population create a situation where multi-
criteria decisions become even more sensitive to accuracy of data and metadata.

This reality creates the impetus for data developers to embrace metadata and ensure data quality
information is included so that evaluation of the data can be done effectively and expediently to
ensure that potential risks can be identified and addressed, or reduced in the event of a disaster.
Faulty decisions based on poor-quality data can lead to loss of life; are data developers always alive
to this?

Another reality seems pervasive across the GI community. Despite the recognised environmental
risks, the uptake of data quality metadata does not appear to take high priority in terms of data
management or sharing. ESDIN Work Package 8 is attempting to address this with the publication of
guidelines to support the documentation of data quality across National Mapping and Cadastral
Agencies (NMCAs) across the European Union.

This white paper introduces the challenges the ESDIN data quality guidelines face and offers possible
solutions to encourage the incorporation of these guidelines into INSPIRE, and their uptake by the GI
communities across the EU.

1 http://inspire.jrc.ec.europa.eu/
1 ESDIN requirements and objectives regarding metadata
The ESDIN project (www.esdin.eu) is a best practice network in the eContentPlus programme,
helping National Mapping and Cadastral Agencies (NMCAs) to implement the data specifications
required by the INSPIRE Directive, focusing on Annex I themes.

ESDIN has focused on the following INSPIRE Annex I themes, all topographical:

• Coordinate Reference Systems,
• Administrative Boundaries,
• Cadastral Parcels,
• Hydrography,
• Transport Networks, and
• Geographical Names.

These data themes are mainly the responsibility of the National Mapping and Cadastral Agencies in
each country, and accordingly the business model focuses on the activities of the NMCAs. ESDIN
Work Packages 6 and 7 have extended some of the INSPIRE data specifications to better meet the
needs of data curators at the NMCAs.

ESDIN Work Package 8 has been responsible for common guidelines on metadata describing
geographic data – not only its properties, but its provenance, and crucially, its quality, where data
quality can be measured and communicated in a standard way.

INSPIRE has involved the creation of common models which can be used across Europe by different
data providers. Common concepts for metadata about geographic information aid end-users in
discovering data that may be useful to them and in evaluating whether that data will be
adequate for their purposes. Thus ESDIN WP8 has developed guidelines for the creation of
discovery metadata and evaluation or explorative metadata.

“Discovery Metadata” (Nebert, 2000) is the minimum amount of information that needs to be
provided to convey to the user the nature and content of the data resource. “Evaluation or explorative
metadata” provides sufficient descriptive information about the data to enable the user to ascertain
its fitness for purpose. Evaluation metadata includes information about data quality, on which we
focus here.

2 Challenges for data quality metadata


2.1 Perception challenge

NMCAs have challenges to face in accepting the ESDIN Metadata Guidelines for the measurement
and documentation of spatial data quality. The reality of partial, incomplete metadata and the
challenge involved in curating and creating more of it provide the impetus for recommending
and considering a business model: a means by which to encourage and support the NMCAs to
incorporate data quality statements into their data management and sharing practices.

As it stands, there are few incentives for data developers to create metadata; this pertains to search
and response levels of metadata, the latter involving the inclusion of quality statements for their
datasets. The conceptualists proclaim their beliefs in the importance and benefits of metadata
creation; data developers do not share this enthusiasm, though they acknowledge that data
quality information exists as part of their internal data management practices.
This is only a general perspective which does not address the other more complex challenges
associated with the cultural and national diversities existing across the EU; papers presented at the
annual INSPIRE conference illustrate this reality. Some countries report collaboration across various
sectors in a country, between different government authorities, or between countries; others voice
intent to collaborate; then others indicate a complete absence of coordination.

The INSPIRE conferences also reveal a plethora of data harmonisation project initiatives outwith
ESDIN to support cross-border data sharing; the inclusion of data quality elements for INSPIRE is
one focus. The “Data quality and metadata for evaluation and use within INSPIRE” workshop at the
INSPIRE 2010 Conference in Krakow, Poland revealed issues which confirm that data quality has
not been widely embraced in metadata2.

A discussion paper on quality requirements of INSPIRE was prepared with the following objectives:
• find evidence whether specifying data quality requirements is appropriate for INSPIRE;
• if so, propose data quality elements, measures, and target values;
• fix how metadata on data quality has to be presented;
• provide guidance on data quality requirements and metadata for the sets of data themes in
INSPIRE Annexes I, II and III;
• formulate proposals amending the INSPIRE data specification template, if appropriate;
• raise awareness about the role of data quality and metadata in spatial data infrastructures.

The discussion paper had been disseminated for consultation to Member States via their nominated
contacts. Specific questions pertaining to data quality and metadata for evaluation and use were
included in the paper; the comments were reported for discussion at the INSPIRE Quality workshop
in Krakow.

The discussion paper would have reached the 27 Member States, but only 15 of those were
represented at the workshop, and of these 15, only a few offered any collective feedback; most
provided a personal perspective as there had been no time for, or no response from, authoritative
sources in their respective countries.

A general discussion amongst the 50+ people ensued at the workshop which revealed the
ambivalence towards the inclusion of data quality elements in INSPIRE; there was also a degree of
uncertainty voiced with regards to Conformity and Data Transformation presented in the discussion
paper. More wasn’t seen as better with regards to metadata; in general, it’s viewed as a labour-
intensive and tedious activity. There are other sources which confirm that this attitude is pervasive
across all GI communities (West, Hess, 2002) (Mathys, 1999). A survey of archaeological
organisations in the Republic of Ireland revealed that 57.1 percent of these organisations did not
include metadata creation as part of their organisations’ data management strategy (INSTAR Final
Report, 2008). The Open Geospatial Consortium (OGC) Working Group on Data Quality conducted
a survey in 2008 which revealed that about 55.8 percent of respondents said they were not using any
recognised standards for data quality work being conducted in their organisation3. Academia appears
to represent a greater challenge based on engagement with that community in the United Kingdom
(Mathys, 2004).

2 http://gogeo.blogs.edina.ac.uk/2010/07/16/inspire-2010-conference-22-25-june-krakow-poland/
3 http://www.opengeospatial.org/pressroom/pressreleases/911
2.2 Policy challenge

Many participants at the INSPIRE conferences represent the public sector across the EU.
Considering the importance of spatial data quality for supporting planning, emergency response and
public infrastructure mapping, there can be a strong argument and justification made at policy level
for the documentation of data quality within government. The challenge is to induce uptake of
recommendations where there is no formal mandate (as there is for the common metadata profile in
the INSPIRE Metadata Implementing Rules).

What policy should EU governments take to guarantee both that metadata is of a sufficient quality to
ease evaluation and re-use, and that data quality metadata is of a sufficient consistency to leave end-
users feeling assured that datasets are fit for their purposes?

Over the years we have collected evidence at EDINA on the level of uptake of metadata standards in
academic institutions. A 2006 spatial data audit conducted at four UK academic institutions yielded
500+ spatial dataset titles and hundreds more orphan datasets, those recognised as GIS files, but with
no provenance4 5. Many of these spatial datasets identified in the audit and countless more created in
academia would probably fall under the INSPIRE Annex III Theme.

See Appendix A for a list of the different data themes falling under each INSPIRE Annex - the later
Annexes imply a later date of legal obligation to make metadata and data access services available.
Datasets in Annex II and III could be relevant to some of the ESDIN targeted INSPIRE Annex I
topographical data themes, and the provision of Annex I theme datasets will be essential for the
automated quality measurement of datasets falling under other themes. Spatial datasets created
within academia, as well as by companies or community “crowdsourcing” projects such as
OpenStreetMap, can be of supplemental use to the NMCAs, especially if their quality and the
improvement in relative quality over time can be consistently described.

Given the existing budgetary constraints many NMCAs now face, a change of role to provide more
quality assurance and data aggregation, than actual curation, seems necessary or at least beneficial.
We consider and propose measures which will not only support the creation of metadata, but also the
inclusion of data quality statements. A key consideration is that there is a limit to what each NMCA
can achieve, in terms of process efficiency gains and cost savings, by acting alone; this is essentially
a problem of co-ordination which requires a collective solution.

The absence of a centralised EU/INSPIRE metadata organisation leaves few options to encourage
the support of providing data quality statements in metadata. There are the individual recommended
organisational services described before which could be considered and implemented. The online
spatial data validator could be very useful, and certainly tools for metadata automation, but for both,
considerable investment and cost/benefit analysis of the impact of that investment is required.

3 Recommendations
3.1 Automation tools and automating metadata validation and quality evaluation

3.1.1 Automation tools

Automation is seen as a panacea for data quality metadata, but more resources must be invested in
research and development to achieve the benefits (cost savings and process efficiency gains) of
automatic generation of metadata about data and its quality. There are notable efforts towards
meeting this goal, including the development of customised utilities embedded within a common
GIS application to derive information (Batchellor, 2008). Olfat, Rajabifard and Kalantari (2010)
propose a process of synchronisation of the metadata record with the dataset. This would
ensure that processes and updates applied to the dataset would be recorded in the embedded
metadata record.

4 http://tiny.cc/tfigu
5 http://tiny.cc/el350

Automation may not be able to extract all data quality information, but must be investigated and
pursued further to provide the tools to simplify metadata creation. The role of shared schemas,
ontologies or thesauri is critical here, where logical consistency across datasets must have reference
to a common thesaurus. The inclusion of basic automation and semi-automation tools removes some
of the tedious manual effort, which in turn might encourage data developers to direct more attention
to measuring data quality and describing it in metadata. Tools for capturing data extent co-ordinates
for bounding boxes, extracting extents from place name keywords and automatically populating
contact detail fields are a few basic examples of semi-automation tools for dataset metadata (Mathys
and Reid, 2009).
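As an illustration of the kind of semi-automation described above, the following minimal Python sketch derives a bounding box from a dataset's coordinates and pre-fills a metadata record with it, together with stored contact details. The function names and the dictionary layout are hypothetical; only the four bounding box element names follow the ISO 19115 geographic bounding box. It assumes longitude/latitude coordinates in decimal degrees and does not handle extents that cross the antimeridian.

```python
from typing import Dict, Iterable, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def bounding_box(coords: Iterable[Coordinate]) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) for a collection of coordinates."""
    points = list(coords)
    if not points:
        raise ValueError("cannot derive an extent from an empty dataset")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return min(lons), min(lats), max(lons), max(lats)


def prefill_record(coords: Iterable[Coordinate], contact: Dict[str, str]) -> Dict[str, object]:
    """Populate the fields that can be filled without manual input.

    The record layout is an assumption made for this sketch; a real tool
    would write into whichever metadata profile the organisation uses.
    """
    west, south, east, north = bounding_box(coords)
    return {
        "westBoundLongitude": west,
        "southBoundLatitude": south,
        "eastBoundLongitude": east,
        "northBoundLatitude": north,
        "contact": dict(contact),  # copied from stored organisational details
    }
```

The data developer is then left with the fields that genuinely need human judgement, such as the data quality statements themselves.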

3.1.2 Automated validation

Automated validation is another step forward which can encourage support and uptake of data
quality metadata.

We suggest, as an outcome from the ESDIN project, that work begin on an online spatial data
validator for INSPIRE Annex I data developers (NMCAs). The NMCAs are assumed to be the
custodians of the highest quality base map data. This data would serve as the primary source for the
validation of datasets produced internally by other organisations, public authorities or contractors
working with them. As part of good data management, derived datasets that contain updates to or
generalisations of core NMCA datasets should have information about the changes made, and the
sequences of operations used to make those changes, recorded in the associated metadata.

An online validator should be designed to support the following data quality elements:

• logical consistency,
• positional accuracy,
• completeness,
• temporal integrity, and
• conformance.
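Two of these elements lend themselves to a short illustration. The Python sketch below compares a submitted dataset with an authoritative reference dataset and reports completeness (omission and commission) and positional accuracy as a root mean square error. It is a sketch under stated assumptions, not a proposed design: features are assumed to be matched by a shared identifier, coordinates are assumed to be in the same projected reference system with units of metres, and both completeness rates are expressed relative to the number of reference features, which is one common convention.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in metres, same CRS for both datasets


def completeness(reference: Dict[str, Point], candidate: Dict[str, Point]) -> Dict[str, float]:
    """Rates of missing (omission) and excess (commission) features."""
    if not reference:
        raise ValueError("reference dataset is empty")
    missing = reference.keys() - candidate.keys()
    excess = candidate.keys() - reference.keys()
    return {
        "omission_rate": len(missing) / len(reference),
        "commission_rate": len(excess) / len(reference),
    }


def positional_rmse(reference: Dict[str, Point], candidate: Dict[str, Point]) -> float:
    """Root mean square positional error over features present in both datasets."""
    shared = reference.keys() & candidate.keys()
    if not shared:
        raise ValueError("no features in common to compare")
    squared = [
        (reference[i][0] - candidate[i][0]) ** 2 + (reference[i][1] - candidate[i][1]) ** 2
        for i in shared
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Logical consistency, temporal integrity and conformance would need rule sets specific to each data theme and are not covered by this sketch.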

The benefit of establishing an online validator can also be extended to end users of NMCA data.
Users will derive data, use datasets in an application, or supplement them with new information or
features for an application. The online validator offers end users of NMCA data the opportunity to
submit their new datasets for validation against these authoritative NMCA data sources, to ensure
the integrity remains the same, and with this result, receive certification, or a pedigree, which can be
included in the new dataset’s metadata record; the data quality information would be retained,
though amended to reflect modifications made to the dataset which cause it to deviate from the
original data source.
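The idea of a retained but amended pedigree can be sketched in a few lines of Python. The record structure and field names below are hypothetical and chosen only for illustration: the derived record copies the source dataset's quality information, appends the processing steps that produced the derived dataset, and notes when it was validated against the source.

```python
import copy
from datetime import date
from typing import Dict, List, Optional


def derive_metadata(
    source_metadata: Dict[str, object],
    title: str,
    process_steps: List[str],
    validated_on: Optional[date] = None,
) -> Dict[str, object]:
    """Build a metadata record for a derived dataset.

    All keys used here ("title", "source", "lineage",
    "validated_against_source") are assumptions made for this sketch.
    """
    derived = copy.deepcopy(source_metadata)  # retain quality information unchanged
    derived["title"] = title
    derived["source"] = source_metadata["title"]
    lineage = list(derived.get("lineage", []))
    lineage.extend(process_steps)  # record what was changed and in what order
    derived["lineage"] = lineage
    derived["validated_against_source"] = (
        validated_on.isoformat() if validated_on else None
    )
    return derived
```

Because the source record is deep-copied, the authoritative record is never modified; only the derived record accumulates lineage.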

A dataset with new information or features could still be validated for logical consistency, but might
require a national-level agency to conduct field checks or examine aerial photos as part of quality
assurance control to ensure a dataset’s authenticity. The benefit to the NMCA would be measured in
terms of staff time and cost dedicated to mapping new features. An example would be an access road
added to an NMCA’s transport network dataset.

The main purpose of establishing a validator for metadata would be to ensure retention of the
primary dataset’s data quality pedigree which would be held in metadata records created for new
datasets. This scenario would mean more metadata records retaining data quality information, which
would demonstrate the value of data quality and gradually change the mindsets of data developers
and users across the wider European GI community.

A cost saving measure could be the creation of an online validator with a centralised geodata library
holding NMCA base map data from all the countries. Users would select the relevant base map data
to run the validation. This would relieve the NMCAs from taking direct responsibility for
establishing the support infrastructure for the validation tools.

3.1.3 Existing Validators

1Spatial’s Radius Studio6 is an example of a desktop-oriented, post-facto metadata and quality
validator for spatial data sets and products. It would be of benefit to consider its potential use value
as an external hosted or managed service, in parallel to its use in internal production processes.

The Java OpenStreetMap Editor (JOSM)7 is an editor client for the OpenStreetMap collaborative
mapping project. JOSM has a “Validator” plugin which can also inform a model for an online
validation service. JOSM Validator is particularly interesting in that it works at commit-time, i.e.
data quality, at least at the level of logical and topological consistency, is being measured and
reported on even as surveyors commit data to a central repository.
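The commit-time pattern can be illustrated with a short Python sketch. It is not based on JOSM's actual rules or code; the two checks shown (a polygon ring must be closed and must not contain consecutive duplicate nodes) are simple examples of logical consistency tests, and the in-memory dictionary stands in for a central repository.

```python
from typing import Dict, List, Tuple

Node = Tuple[float, float]


def check_ring(ring: List[Node]) -> List[str]:
    """Return a list of consistency problems found in a polygon ring."""
    problems = []
    if len(ring) < 4:
        problems.append("ring has fewer than 4 positions")
    if ring and ring[0] != ring[-1]:
        problems.append("ring is not closed")
    if any(a == b for a, b in zip(ring, ring[1:])):
        problems.append("ring has consecutive duplicate nodes")
    return problems


def commit(repository: Dict[str, List[Node]], feature_id: str, ring: List[Node]) -> List[str]:
    """Store the feature only if it passes validation; always report the result."""
    problems = check_ring(ring)
    if not problems:
        repository[feature_id] = ring
    return problems
```

The point of the design is that the quality report is produced as a side effect of every commit, so quality measurement does not depend on a separate, later inspection step.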

3.2 Simplification

The data quality statement should be a mandatory requirement for ESDIN’s targeted INSPIRE
Annex I Theme topographical data. Important buildings, governmental facilities and natural risk zones
of the INSPIRE Annex III Theme may be considered as well, but most other data themes under
INSPIRE Annex II and Annex III may benefit from the simplification of some of the ESDIN
Metadata Guideline elements.

Positional accuracy, completeness, scale and logical consistency elements could be assigned lists for
INSPIRE Annex II and III Theme data developers to select values for metadata creation. INSPIRE
Annex II Theme geological data and INSPIRE Annex III Theme atmospheric, meteorological and
oceanographic data are examples where the ESDIN Metadata Guideline elements could be versioned
to include lists. These data themes may not require the same precision as Annex I Theme transport
networks.

As an example, the positional accuracy element could have a range of values which correspond to
collection methods and application areas, as presented in Table 1, taken from a Chesapeake Bay
Program (2007) report which provides this geospatial accuracy and precision tier as part of a data
quality assurance project plan.

Table 1.

| Tier Level | Accuracy and Precision | Examples of Horizontal Collection Method | Example CBP Application Area |
|---|---|---|---|
| Tier 1 | < 1 m | Classical Surveying Techniques; plus GPS Carrier Phase Static Relative Position | Generally not applicable to CBP projects |
| Tier 2 | 1 – 5 m | GPS Carrier Phase Kinematic Relative Position | Generally not applicable to CBP projects |
| Tier 3 | 6 – 25 m | GPS Code (Pseudo Range) Standard Position | CBP Shallow Water Monitoring Data |
| Tier 4 | 26 – 100 m | GPS unspecified; Photo/GIS Interpolation | SAV mapping, land cover, mapping |
| Tier 5 | 101 – 200 m | Urban style address matching | Toxics, Point Source |
| Tier 6 | 201 – 999 m | Public Land Survey – Sixteenth Section | Protected Lands Boundaries |
| Tier 7 | 1000 – 2000 m | Address Matching – Block Face | Coarse Scale geographic targeting tools |
| Tier 8 | 2001 – 5000 m | Census Block Centroid | Watershed (HUC11) statistical summaries and indicators |
| Tier 9 | > 5000 m | Zip Code Centroid | Airshed impacts, Priority Living Resource Areas |
| Tier 10 | Unknown | N/A | Relative contextual data |

6 http://www.1spatial.com/products/index.php?ov=2#1282060502167_1/5
7 http://wiki.openstreetmap.org/wiki/JOSM/Plugins/Validator
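A pick list of this kind is easy to apply automatically. The Python sketch below maps a stated positional accuracy in metres to the tier levels of Table 1. The function name is hypothetical, and one assumption is needed: the published ranges leave small gaps (for example between 5 m and 6 m), so a value falling in a gap is assigned to the next coarser tier.

```python
from typing import Optional

# Upper bounds in metres for Tiers 2 to 8, taken from Table 1.
_UPPER_BOUNDS_M = (5, 25, 100, 200, 999, 2000, 5000)


def accuracy_tier(accuracy_m: Optional[float]) -> int:
    """Return the Table 1 tier (1 to 10) for a positional accuracy in metres.

    None means the accuracy is unknown, which Table 1 lists as Tier 10.
    Values in the gaps between published ranges go to the coarser tier
    (an assumption of this sketch, not part of the source table).
    """
    if accuracy_m is None:
        return 10
    if accuracy_m < 0:
        raise ValueError("accuracy must be non-negative")
    if accuracy_m < 1:
        return 1
    for tier, upper in enumerate(_UPPER_BOUNDS_M, start=2):
        if accuracy_m <= upper:
            return tier
    return 9  # anything above 5000 m
```

A data developer would then record only the tier, or select it from a list, rather than reporting a precise accuracy figure.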

The Ordnance Survey (OS) have applied simplification to spatial data relationships between feature
objects using ontologies.8 This may be relevant as well to a simplification scheme to define
positional accuracy.

• Completeness (commission or omission) could be represented as a list of percentage values
ranging from 0 to 100 percent.
• Scale (source scale denominator) values could also be represented in a list of value ranges from
large to small scale.
• Logical consistency could represent a list of tests, confidence levels or compliance with
topological rules.

3.3 Quality assurance for metadata and data quality descriptions

Data quality metadata itself has quality which can be measured; the standards emerging in ISO have
a consideration of metaquality. Quality assurance should logically result in some form of
certification that the quality assurance has been done and is acceptable; this is a key element in the
usefulness of data quality metadata for end-user evaluation purposes. Of course it is difficult to be
concrete about “fitness for purpose” as the variety of end-user applications is so wide, and novel
applications may always arise that are not at all envisioned by the original data provider.

Yet, a certification of quality assurance (that the quality of the dataset is at an acceptable level, and
that the metaquality of the quality measurements is reliable and comparable to measurements of
other related datasets) should be transferrable between different end user purposes that cannot be
envisaged in advance.

An assured standard for highest-quality data is needed for evaluation, in order to test and measure
more than logical consistency in an automated way. Arguably, the question of how to make useful
aggregates of measurements, and how to compare aggregates over generalised datasets to aggregates
over their original sources, are still active research areas of some contention.

Some benchmark is needed even if it is accepted that the results will always be relative – there may
be no consistency in how consistently quality is measured, thus it is important to consider metaquality.

8 http://www.ordnancesurvey.co.uk/ontology/SpatialRelations.owl
3.4 Consider Metaquality

Metaquality is defined by the current draft of ISO 19157, Geographic information – Data quality, as
constrained to a range of three things describing data quality metadata: the Confidence in the
quality of the quality measurements; the Representativity of any samples tested for quality, with
respect to the quality of everything in the dataset; and the Homogeneity of the rest of the dataset
relative to the samples.

Metaquality is important where data quality evaluation results need to be compared across borders.
One aim of the ESDIN data quality model and metadata guidelines should be to establish consistent
metaquality across data sets published by European member state NMCAs, so that when data products
are combined we know we are comparing like with like. Matt Beare’s presentations to ESDIN WP8
and ISO TC 211 have emphasised quality measurement as a continuous process during the lifecycle
of data from the hand of a surveyor into a downloadable product. A push for consistent metaquality
early on in the process will save time, and cost, aligning and merging different data sets later – an
important consideration at the European level, with regular updates to hundreds of “authoritative”
sources.

Knowledge about the quality of representativity and homogeneity is only useful to us where quality
measurement has been done by sampling rather than by full inspection; where ground surveying is
needed to verify some measurements (absolute positional accuracy); where the correctness of
properties has to be assessed by eye in the absence of a gazetteer or ontology that can be viewed as
definitive for a section of the universe of discourse.
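Where quality has been measured by sampling, confidence can be expressed numerically rather than as free text. As one possible illustration (the choice of method is ours and is not prescribed by ISO 19157 or the ESDIN guidelines), the Python sketch below computes a Wilson score interval for the error rate of a dataset from the number of errors found in a sample. The default z = 1.96 corresponds to roughly 95 percent confidence, and the sample is assumed to be random and representative.

```python
import math
from typing import Tuple


def wilson_interval(errors: int, sample_size: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the true error rate, given a sample.

    errors: number of sampled items that failed the quality test
    sample_size: number of items inspected
    z: standard normal quantile (1.96 is approximately 95 % confidence)
    """
    if sample_size <= 0 or not 0 <= errors <= sample_size:
        raise ValueError("need 0 <= errors <= sample_size and sample_size > 0")
    p = errors / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)
```

Two datasets with the same observed error rate but different sample sizes yield intervals of different widths, which is exactly the information a cross-border comparison of quality results needs.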

Right now the ISO specification envisages metaquality elements as Character Strings, i.e. human-
readable text. A consistent representation of measured quality of confidence would be of benefit.
Aggregation of metadata quality elements faces a similar problem; in fact the two problems could be
usefully combined. An aggregation of the range of quality measurements is the sensible way to
derive an overall assessment of confidence in the measurements, so the describable quality of the
aggregation is the same, in some sense, as the confidence in the quality measurements as a whole –
movements in the one will always affect the other.

Metaquality, metadata automation, validation and the simplification of data quality metadata offer
key steps forward to the support and uptake of the ESDIN Data Quality Guidelines; however, these
require considerable investment and commitment. A cost-effective approach to advance these and
the requirements of the INSPIRE Directive would be to establish a centralised support structure
which could provide these resources and coherent policies for those implementing INSPIRE.

3.5 Establishment of a Centralised Support Organisation

The INSPIRE Directive will prove successful if there is a commitment to the establishment of a
committee or organisation dedicated to providing support for implementation of INSPIRE across the
EU. Many funded projects, including INSPIRE itself, depend on regular attendance and engagement
from nominated or appointed contacts. Revelations from an anonymous INSPIRE committee member
indicate that this arrangement is problematic: from one meeting to the next there will be
replacements for those initially designated as participants, leading to inconsistencies and revisions
to documents and policies associated with the project or initiative. Good co-ordination is critical
within these working groups to ensure consistent delivery against deliverables and targets.

The revised, amended and updated documents on the INSPIRE website are a testament to the need
for more co-ordination within and between these various working groups. Failure to address this will
only create more uncertainty and less inclination towards supporting and implementing policies.

While this isn’t a criticism of INSPIRE, it points to the importance of establishing a permanent
and stable EU/INSPIRE metadata (SDI) organisation which will take responsibility for providing the
resources to support metadata creation; this would include data quality, and could then be
extended to target spatial data falling under the INSPIRE Annex II and Annex III themes.

Each Member State would contribute funds to establish this organisation; it could also be contracted
out to a private sector company which offers specialised staff and resources to support metadata
creation activities including the following:

• Implement basic policies for metadata implementation and creation. An example would be
permitting NMCAs to create ‘search’ level metadata for national SDIs, but delivering ‘response’
level metadata to those requesting more information about the dataset, or requesting the dataset.
This policy could extend to other data providers of targeted Annex II and Annex III theme data as
well, but this is an indirect approach where data developers could assume that there will be no
requests for their datasets or information about their data. It’s probable that surveyors and civil
engineers document dataset creation, but not with the use of metadata tools. Staff from the
organisation can also promote the importance of metadata creation, most notably to administrators
responsible for NMCAs and other data developers. A stronger case can be made to administrators
by weighing metadata creation against the cost and time lost through staff turnover. Data quality is
especially important with regard to good data management practices and establishing a tracking
network for data, particularly for temporally sensitive spatial data used in transport, health and
emergency applications, because changes to datasets without documentation and date stamps can
pose risks to the public. This can be appreciated at the administrative level, which has direct
contact with the public and is hence more inclined than a GIS manager to impose this practice.

• Establish consistent metaquality across data sets published by European member state NMCAs.

• Approach publishing houses to impose conditions on authors for the publication of their articles
and books. Spatial data presented in the publication would need to be deposited in a repository and
metadata records published on a geoportal.

• Provide onsite metadata training which includes introduction and review of INSPIRE and data
quality elements. Attendees would be required to bring key spatial datasets for documentation at
the workshops.

• Offer interactive support that allows data developers to submit their spatial data to the
organisation for partial metadata creation. The partial records would be returned to data developers
for completion, or the developers invited to answer further questions from the organisation’s staff,
who will know how to ask the pertinent questions needed to complete metadata records.

• Develop online eLearning objects for data developers to reference. These could be modules which
provide a metadata context through various scenarios of spatial data capture and processing steps,
which can be cross-referenced to equivalent metadata elements.

• Assume responsibility for the quality assessment for metadata records submitted for publication on
national portals, especially for INSPIRE-compliant metadata.

• Conduct spatial data audits to know where to target support efforts.


• Interact with academic institutions to ensure students are introduced to the importance of
metadata and the INSPIRE Directive.

• Support software engineers in the development of automation tools to extract information from
spatial datasets to populate metadata record fields.
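As a rough illustration of the last point, the sketch below derives a few metadata fields from a small GeoJSON-style dataset. It is a minimal example under stated assumptions, not a proposed tool: the function names and the keys of the returned record are hypothetical and are not taken from INSPIRE or any other metadata schema, and the sample coordinates are invented.

```python
import json
from datetime import date


def walk_coords(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from walk_coords(part)


def extract_metadata(feature_collection):
    """Derive a partial metadata record from a GeoJSON FeatureCollection.

    The keys of the returned dict are illustrative only. GeometryCollection
    geometries, which carry no "coordinates" member, contribute to the
    geometry type list but not to the bounding box.
    """
    xs, ys, geometry_types = [], [], set()
    features = feature_collection.get("features", [])
    for feature in features:
        geometry = feature.get("geometry") or {}
        if "type" in geometry:
            geometry_types.add(geometry["type"])
        for x, y in walk_coords(geometry.get("coordinates", [])):
            xs.append(x)
            ys.append(y)
    return {
        "feature_count": len(features),
        "geometry_types": sorted(geometry_types),
        # The bounding box is left empty when no coordinates were found.
        "bounding_box": (
            {"west": min(xs), "east": max(xs), "south": min(ys), "north": max(ys)}
            if xs
            else None
        ),
        "metadata_date": date.today().isoformat(),
        # Fields a tool cannot infer stay empty for the data developer.
        "lineage": None,
        "data_quality_statement": None,
    }


# Invented sample data: one point and one line.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [-3.19, 55.95]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "LineString",
                      "coordinates": [[-3.2, 55.9], [-3.1, 56.0]]}},
    ],
}

record = extract_metadata(sample)
print(json.dumps(record, indent=2))
```

The extent, feature count and geometry types can be read from the data itself, whereas lineage and the data quality statement are left empty for the data developer to complete, which mirrors the partial metadata creation described in the list above.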

Countries could have the option of not contributing to this centralised service, but would then be
required to deliver quality metadata within a predefined time frame. Countries failing to do this
would incur penalties.

The benefits of establishing a centralised EU/INSPIRE metadata organisation are many. It would
provide specialised and dedicated staff committed to ensuring compliance with INSPIRE through
the services listed above for data developers. It would also guarantee that the resources are there to
support the inclusion of data quality statements in metadata records, which brings ESDIN’s Work
Package 8 to fruition.

The Federal Geographic Data Committee (FGDC) operates as an interagency committee that
promotes the coordinated development, use, sharing, and dissemination of spatial data across the US.
The FGDC also supports the establishment of a National Spatial Data Infrastructure
(NSDI) through funding, training and engagement with various sectors and governmental agencies.

Consistency and quality will be the order of the day for SDIs across Europe. Success in the
implementation of the INSPIRE Directive is critical; a centralised organisation can deliver this and
be far more cost-effective than provision at the Member State level, where inconsistency will prevail
and priorities will be focused on matters other than metadata implementation and data quality
statements.

4 Conclusions
Implementation of the INSPIRE Directive for the 27 Member States of the EU represents a
considerable achievement - a testament to the efforts of those involved in the process. The ESDIN
project takes the next step in addressing data harmonisation for cross-border data sharing. INSPIRE
Conference plenary speakers have noted the importance of this issue; Max Craglia noted again at
the INSPIRE 2010 Conference in Krakow, Poland (see footnote 9) that 115 million people live within
50km of a border, and another 60 million within 25km.

The creation of the ESDIN Data Quality Guidelines represents an important step forward for the EU
if incorporated into the INSPIRE Guidelines. It seems critical that planning departments and other
relevant environmental agencies have confidence in data shared across borders. This can be achieved
only through the inclusion of a data quality statement with each metadata record.

The reality is that, despite the apparent benefits of metadata and the recognised costs of its
absence in terms of lost lives, environmental impact and damage to infrastructure and personal
property, metadata is still seen as tedious and time-consuming to create; some will argue that their
budgets restrict them from creating it.

This white paper recognises the importance of data quality and identifies possible solutions to
encourage its uptake across all NMCAs. It also proposes that these measures can be extended to
Annex II and Annex III theme data through the use of automation, validation, simplification
and centralised services, with the first three implemented and supported through the last.

9 http://inspire.jrc.ec.europa.eu/events/conferences/inspire_2010/presentations/1002_pdf_presentation.pdf

Referring back to Max Craglia’s plenary presentation, he summarised the investment costs for a
reduced scope of INSPIRE across the EU. He gave the total investment per annum over 10
years, in euros, as follows:

EU level: 1.9 million
National Organisations: 13 million
Regional and Local: 77-122 million

The presentation also broke these figures down by activity; the amounts covering metadata activities were:

EU level: 200,000 euros
National Organisations: 1.9-2.2 million euros
Regional and Local: 33 million euros
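The two sets of figures quoted above can be combined with some simple arithmetic. The sketch below is illustrative only: it assumes that the metadata amounts are annual and on the same basis as the totals, which the presentation summary does not state explicitly, and the variable and function names are hypothetical.

```python
from decimal import Decimal as D

# (low, high) ranges in millions of euros per annum, as quoted in the text.
total_investment = {
    "EU level": (D("1.9"), D("1.9")),
    "National organisations": (D("13"), D("13")),
    "Regional and local": (D("77"), D("122")),
}
metadata_investment = {
    "EU level": (D("0.2"), D("0.2")),
    "National organisations": (D("1.9"), D("2.2")),
    "Regional and local": (D("33"), D("33")),
}


def sum_range(ranges):
    """Add up the low ends and the high ends of a set of (low, high) ranges."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


total_low, total_high = sum_range(total_investment)
metadata_low, metadata_high = sum_range(metadata_investment)

# Smallest share: lowest metadata cost against the highest total, and vice versa.
share_low = metadata_low / total_high
share_high = metadata_high / total_low

print(f"Total investment: {total_low}-{total_high} million euros")
print(f"Metadata activities: {metadata_low}-{metadata_high} million euros")
print(f"Metadata share of total: {float(share_low):.0%} to {float(share_high):.0%}")
```

On these assumptions, metadata activities would account for about 35 million euros per annum, or roughly a quarter to two fifths of the total, almost all of it at the regional and local level.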

An argument can perhaps be made that it would be more cost-effective to establish a centralised
metadata service providing policy and support to the EU Member States. A centralised service
would provide full-time staff dedicating their time, resources and energy to supporting
the implementation of the INSPIRE Directive. This scenario would make it easier to include the
ESDIN Data Quality Guidelines in INSPIRE, because data developers would understand that a
centralised service would give them the support needed to provide evaluation level
metadata without the commitments required under the current arrangement. It might be feasible
for the centralised service to have offices within each EU Member State to co-ordinate metadata
activities at that level.

A centralised service could also consider and negotiate partnerships between public and private
sectors for the delivery of some services or tools. The service could also engage and support
academic research. Academic research has been directed at metadata automation and ontologies for
many years with numerous publications as a result.

The establishment of a centralised service does not fall under the remit of the ESDIN project
requirements, but could be the starting point for discussion. This is one suggestion of several
proposed to take the ESDIN Data Quality Guidelines forward. The implementation of the INSPIRE
Directive and the establishment of SDIs across the EU depend on data quality; this goes back to the
initial objective of INSPIRE, which is to support Community environmental policies, and policies or
activities which may have an impact on the environment. Data quality is critical to ensure this, but
there are barriers which need to be addressed; it is hoped that this paper offers a starting point
of reference for addressing them.

5 References

Nebert, D. (2000). Developing Spatial Data Infrastructures: The SDI Cookbook, Version 1.0,
http://www.gsdi.org/pubs/cookbook/chapter03a.html

Batcheller, J.K. (2008). Automating geospatial metadata generation: An integrated data
management and documentation approach. Computers & Geosciences, Elsevier, pp. 387–398.

Chesapeake Bay Program. (2007). Geospatial Data Quality Assurance
Project Plan: Standard Operating Procedures for Managing Geospatial Data.
http://www.chesapeakebay.net/content/publications/cbp_33365.pdf

Irish National Strategic Archaeological Research (INSTAR). (2008). INSTAR Final Report: Spatial
Heritage & Archaeology Research Environment I.T. Feasibility Project. Dublin, Republic of Ireland.

Olfat, H. Rajabifard, A. and Kalantari, M. (2010). Automatic Spatial Metadata Update: a New
Approach. Coordinates Magazine.

Mathys, T. (1999). Metropolitan GIS: The Minnesota Metadata Mission, Geo Info Systems
Magazine. November.

Mathys, T. (2004). The Go-Geo! Portal metadata initiatives. In: Proceedings of the Geographical
Information Science Research UK 12th Annual Conference, University of East Anglia, Norwich, UK,
pp. 148–154.

Mathys, T. and Reid, J. (2009). Three use case scenarios for geospatial metadata automation. In:
Automatic Metadata Generation: Use Cases and Tools/Priorities Synthesis Report on Automated
Metadata Generation and its Uses. Intrallect, Edinburgh, Scotland, UK.

West Jr., L.A., Hess, T.J. (2002). Metadata as a knowledge management tool: Supporting intelligent
agent and end user access to spatial data. Decision Support Systems, 32 (3), pp. 247-264.

6 Appendix A – list of data themes in INSPIRE Annexes I, II and III


http://inspire.jrc.ec.europa.eu/reports/ImplementingRules/DataSpecifications/D2.3_Definition_of_Annex_Themes_and_scope_v3.0.pdf

6.1 Annex I themes

• Coordinate reference systems
• Geographical grid systems
• Geographical names
• Administrative units
• Addresses
• Cadastral parcels
• Transport networks
• Hydrography
• Protected sites

Note that ESDIN covers only the following Annex I data themes at large scale: Geographical names,
Hydrography, Cadastral parcels, Transport networks, Administrative units.

6.2 Annex II themes

• Elevation
• Land cover
• Orthoimagery
• Geology

6.3 Annex III themes

• Statistical units
• Buildings
• Soil
• Land use
• Human health and safety
• Utility and government services
