Python module for the automated and metadata-based collection, ingestion and formatting of raw EU data from national providers.
| documentation | available at: ... |
| status | since 2020 – in construction |
| contributors | |
| license | EUPL |
Quick install and start
Once installed, the module can be imported simply:
>>> import pyeudatnat
Notebook examples
- A basic example regarding healthcare services to start with the module.
- ...
Usage
You first need to create a dedicated class based on the metadata associated with each national dataset:
>>> from pyeudatnat import base
>>> NewDataCategory = base.datnatFactory(cat = 'new')
It is then straightforward to create an instance of a national dataset:
>>> datnat = NewDataCategory()
>>> datnat.load_data()
>>> datnat.format_data()
>>> datnat.save_data(fmt = 'csv')
Note that the output schema (see also "attributes" in the documentation below) should be defined outside the module, e.g. in an external config.py file.
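To illustrate the idea behind the usage above, here is a minimal, standard-library-only sketch of a class factory driven by an externally defined schema. It is not the pyeudatnat implementation: `BaseDataset`, `dataset_factory` and `OUTPUT_SCHEMA` are hypothetical names introduced for illustration, and the method signatures (which take file-like objects) are assumptions that differ from the module's own `load_data`, `format_data` and `save_data`.

```python
import csv
import io

# Hypothetical output schema, of the kind that could live in an external
# config.py file: output column name -> input column name.
OUTPUT_SCHEMA = {"id": "id", "name": "hospital_name", "city": "city"}


class BaseDataset:
    """Illustrative base class; NOT the pyeudatnat implementation."""

    category = None
    schema = {}

    def __init__(self):
        self.rows = []

    def load_data(self, source):
        # Read raw records from any file-like object holding CSV text.
        self.rows = list(csv.DictReader(source))

    def format_data(self):
        # Keep only the columns named in the schema, renamed to the output names.
        self.rows = [
            {out: row.get(src, "") for out, src in self.schema.items()}
            for row in self.rows
        ]

    def save_data(self, target):
        # Write the formatted records as CSV to any file-like object.
        writer = csv.DictWriter(target, fieldnames=list(self.schema))
        writer.writeheader()
        writer.writerows(self.rows)


def dataset_factory(cat, schema):
    # Build a new subclass at run time, parameterised by category and schema.
    name = cat.capitalize() + "Dataset"
    return type(name, (BaseDataset,), {"category": cat, "schema": dict(schema)})


# Made-up sample input, used only to exercise the sketch.
raw = io.StringIO("id,hospital_name,city,beds\n1,St Mary,Dublin,120\n")

NewDataCategory = dataset_factory("new", OUTPUT_SCHEMA)
datnat = NewDataCategory()
datnat.load_data(raw)
datnat.format_data()
out = io.StringIO()
datnat.save_data(out)
```

Keeping the schema in a separate mapping means the same load/format/save steps can be reused for any data category, with only the metadata changing.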
- Various geocoding services, including GISCO.
The default geocoder is GISCO, but you can also use a different geocoder, provided you supply an appropriate key.
- Automated translation,
- ...
Software resources/dependencies
- Packages for data handling: pandas.
- Packages for geocoding: geopy, pyproj and happygisco.
- Package for JSON formatting: geojson.
- Package for translations: googletrans.