Karin Kastehein

Estonian Information Society Yearbook 2011/2012



users. Submitting the query https://dhs.riigikantselei.ee/avalikteave.nsf/contractsbydate?open&path=2011/12|Detsember to the document register, we get a response in the following form:

      <document noteid="NT0017AE7E">
      <field name="date">30.12.2011</field>
      <field name="docid">L11165</field>
      <field name="subject">Trükiste kujundamine ja trükkimine</field>
      <field name="documenttype">Töövõtuleping</field>
      <field name="contractstartdate">30.12.2011</field>
      <field name="contractenddate">20.01.2012</field>
      </document>

      But the Government Office document register is a positive exception to the rule. Most registers of this type output HTML text that cannot be processed directly for re-use.
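Because the response shown above is structured XML rather than HTML, it can be read with a standard parser. The following is a minimal sketch in Python, assuming the response consists of a single `document` element like the sample (a real response may wrap several documents in an outer element not shown in the text). The function name `parse_document` and the conversion of dates to ISO 8601 are illustrative choices, not part of the register.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Sample response copied from the text, written with straight quotes
# so that it is well-formed XML.
SAMPLE = """<document noteid="NT0017AE7E">
<field name="date">30.12.2011</field>
<field name="docid">L11165</field>
<field name="subject">Trükiste kujundamine ja trükkimine</field>
<field name="documenttype">Töövõtuleping</field>
<field name="contractstartdate">30.12.2011</field>
<field name="contractenddate">20.01.2012</field>
</document>"""

# Fields that hold dates in the sample (DD.MM.YYYY).
DATE_FIELDS = {"date", "contractstartdate", "contractenddate"}


def parse_document(xml_text):
    """Turn one <document> element into a plain dict keyed by field name."""
    root = ET.fromstring(xml_text)
    record = {"noteid": root.get("noteid")}
    for field in root.findall("field"):
        name = field.get("name")
        value = (field.text or "").strip()
        if name in DATE_FIELDS and value:
            # Convert DD.MM.YYYY to ISO 8601 so the dates sort correctly.
            value = datetime.strptime(value, "%d.%m.%Y").date().isoformat()
        record[name] = value
    return record


record = parse_document(SAMPLE)
print(record["docid"], record["contractstartdate"], record["contractenddate"])
```

A register that returns only HTML would instead require scraping the page layout, which breaks whenever the layout changes.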

      Public services do not remove the need for downloadable data. For the most part, Estonian public services lack facilities for downloading search results, to say nothing of mashup options.

      Pursuant to the Public Information Act, the data processed in a database must be publicly available unless access restrictions are established by or derive from law. Personal data in a database, however, is not to be published unless the law imposes an obligation to do so. Thus speed camera data and incidents registered by the police, among other categories of information, should be public as long as the personal data is redacted. The Estonian public sector has largely disregarded this requirement: the public part of registers that contain personal data has been left unpublished, let alone presented in a reusable form.

      The options for re-using public data from registers that contain personal data are modest.
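The redaction step described above, publishing register records once the personal data has been removed, can be sketched as follows. This is only an illustration: the field names and the sample record are hypothetical, since the text does not give the schema of any register, and deciding which fields count as personal data is a legal question that the code does not answer.

```python
# Hypothetical names of fields that identify a person; a real register
# would need its own list, determined under the applicable law.
PERSONAL_FIELDS = {"owner_name", "personal_code", "licence_plate"}


def redact(record, personal_fields=PERSONAL_FIELDS):
    """Return a copy of the record without the fields marked as personal."""
    return {key: value for key, value in record.items()
            if key not in personal_fields}


# Made-up speed camera record, for illustration only.
records = [
    {
        "camera_id": "CAM-01",
        "measured_at": "2011-12-30T14:05:00",
        "speed_kmh": 104,
        "limit_kmh": 90,
        "owner_name": "Example Person",
        "licence_plate": "000AAA",
    },
]

public_records = [redact(r) for r in records]
print(public_records[0])
```

The original records are left unchanged, so the restricted and public versions of the register can be kept side by side.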

      Estonian open data is scattered across institutions. The most important generators of open data are:

      • Land Board – geoportal: http://geoportaal.maaamet.ee

      • Environmental information centre: http://www.keskkonnainfo.ee

      • Statistics Estonia: http://pub.stat.ee/px-web.2001/dialog/statfile2.asp

      • National Library digital archives DIGAR: http://digar.nlib.ee

      • National Archive digital archive.

      Sample query to the Government Office’s document register

      One way to make open data easier to find and to organize its presentation better would be the open data repository currently in pilot stage, http://opendata.riik.ee.

      Open data repository

      Uuno Vallner

      [email protected]

      Ministry of Economic Affairs and Communications

      Tanel Tammet

      [email protected]

      Tallinn University of Technology

      Aleksander Reitsakas

      [email protected]

      Aktors OÜ

      To improve the availability of open data and coordinate its posting, a pilot interface has been developed for an open data repository at http://opendata.riik.ee. Metadata for the public sector’s open data should be uploaded there. Public departments and agencies can also upload datasets there if they so choose.

      Repository for open-source data

      From the standpoint of open data, what Estonia’s public sector needs most are changes in legislation and in organizational and technical principles. With this in mind, the Open Data Framework project was launched for 2011-2012. A procurement was held to develop infrastructure that supports opening up data and to lay the organizational, technical and semantic preconditions for doing so. The following were the planned outcomes of the procurement:

      • developing a website at opendata.riik.ee (beta) as Estonia’s information gateway for access to and use of open data;

      • creating infrastructure for publishing data (repository; beta);

      • specifying, in cooperation with open data communities, the preliminary organizational, technical and semantic requirements for open data;

      • the cloud solution CKAN was recommended to power the central repository (http://ckan.net);

      • the cloud-based Drupal content management system was recommended as the front-end system (http://drupal.org);

      • Apache SOLR was recommended as the search engine (http://lucene.apache.org/solr);

      • interfaces that support RDF and SPARQL standards were required;

      • a LAMP platform was required;

      • interoperability with other repositories was required;

      • it was assumed that institutions could establish their own repositories, but the central repository had to be capable of picking metadata from them;

      • it was presumed that institutions could load datasets directly to the central repository.
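The requirement that the central repository be able to pick up metadata from institutional repositories can be illustrated with a short sketch. It assumes each institution offers its metadata as a list of simple records; the record layout, the function name `harvest` and the sample entries are all hypothetical, and a real deployment would rely on the harvesting facilities of the chosen platform rather than on this code.

```python
def harvest(institution_catalogues):
    """Collect metadata records from institutional catalogues into one index.

    Only metadata is copied. Each entry keeps the URL of the dataset, so the
    data itself can stay in the institution's own repository.
    """
    central = {}
    for institution, records in institution_catalogues.items():
        for rec in records:
            # Key by institution and local id so ids from different
            # institutions cannot collide.
            central[(institution, rec["id"])] = {
                "publisher": institution,
                "title": rec["title"],
                "format": rec.get("format", "unknown"),
                "url": rec["url"],
            }
    return central


# Made-up catalogues, for illustration only.
catalogues = {
    "Institution A": [
        {"id": "a-1", "title": "Example dataset 1", "format": "csv",
         "url": "http://a.example.org/data/1.csv"},
        {"id": "a-2", "title": "Example dataset 2",
         "url": "http://a.example.org/data/2"},
    ],
    "Institution B": [
        {"id": "b-1", "title": "Example dataset 3", "format": "xml",
         "url": "http://b.example.org/data/3.xml"},
    ],
}

central_index = harvest(catalogues)
for (publisher, dataset_id), entry in central_index.items():
    print(publisher, dataset_id, entry["format"], entry["url"])
```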

      Front page of the pilot application of the open data website

      The pilot version of the central open data site can be found at http://opendata.riik.ee. The site consists of three integrated systems:

      • A site where news and manuals can be posted, questions raised and discussions on open data held.

      • A CKAN-based database of open data links, specifications and key metadata (see http://ckan.org), which can be reached from the Open Data item on the site’s upper menu bar. This database supports the following:

      1) searching for and downloading open data, without access restrictions;

      2) adding new open data (registration and user privileges from an administrator required).

      • A repository for datasets, which is one of the possible places where a government department can save open data.
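Searching the CKAN-based catalogue and collecting download links could look roughly like the sketch below. It relies on the `package_search` call of the action API documented for current CKAN releases; whether the pilot site exposes that API at this base URL is an assumption, and the sample response is made up to match the general shape of a CKAN reply. No network request is made.

```python
import json
from urllib.parse import urlencode

# Assumption: the catalogue serves CKAN's action API under this base URL.
BASE_URL = "http://opendata.riik.ee"


def build_search_url(query, rows=10, base_url=BASE_URL):
    """Build a package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def extract_downloads(response_text):
    """Return (dataset title, format, URL) for every resource in a reply."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("the catalogue reported a failed request")
    downloads = []
    for dataset in payload["result"]["results"]:
        for resource in dataset.get("resources", []):
            downloads.append(
                (dataset["title"], resource.get("format", ""), resource["url"])
            )
    return downloads


# Made-up reply, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {
                "name": "example-dataset",
                "title": "Example dataset",
                "resources": [
                    {"format": "CSV", "url": "http://example.org/data.csv"},
                    {"format": "XML", "url": "http://example.org/data.xml"},
                ],
            }
        ],
    },
})

print(build_search_url("eelarve", rows=5))
for title, fmt, url in extract_downloads(SAMPLE_RESPONSE):
    print(title, fmt, url)
```

Adding new datasets, the second use listed above, requires an authenticated account and is left out of the sketch.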

      Technically, the preconditions for developing open-data infrastructure have been created. But technical solutions are not enough. It will be necessary to staff and train a team capable of administering and developing the infrastructure and of performing supervision; its activity should cover public sector data generators as well as the open-data communities that develop services.

      How to publish?

      In what format? The main principle is that it is much better to publish data in an inconvenient encoding than not to publish them at all on the grounds that the encoding will be improved at some unspecified later time. Moreover, a published dataset can always be republished later in a new, better encoding.

      In the context of open data, we recommend evaluating the user-friendliness of formats and encodings on the basis of Tim Berners-Lee’s five-star system [19], which is described in the previous article. Datasets are best published in formats that can be opened and processed using freeware applications. These include .odt document files as well as some of the most common formats for structured data, such as .csv, .json and .xml.

      Formats that can be opened and modified by freeware applications are well-suited to re-use.
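As an illustration of the structured formats named above, the following sketch writes the same records as CSV and as JSON using only the Python standard library. The record reuses three fields from the document register sample earlier in the text; the function names are illustrative.

```python
import csv
import io
import json

# One record built from fields of the document register sample.
records = [
    {"docid": "L11165", "date": "30.12.2011", "documenttype": "Töövõtuleping"},
]


def to_csv(rows):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize a list of dicts as JSON text.

    ensure_ascii=False keeps Estonian characters such as õ and ö readable
    instead of turning them into escape sequences.
    """
    return json.dumps(rows, ensure_ascii=False, indent=2)


print(to_csv(records))
print(to_json(records))
```

Both outputs can be opened with freely available tools, and either can be converted into the other without loss for flat records like these.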

      One-star formats should be avoided when opening data. Even so, publishing data in such a format is certainly better than not publishing them at all.

      Two-star formats