pydst package
Submodules
pydst.pydst module
This module provides the Dst class, the workhorse for retrieving subjects, tables, metadata and data from Statistics Denmark.
-
class pydst.pydst.Dst(lang='en')
Bases: object
Retrieve subjects, metadata and data from Statistics Denmark.
This class provides some simple functions to retrieve information from Statistics Denmark’s API.
-
lang
Can take the values en for English or da for Danish.
Type: str
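The methods documented below accept an optional lang argument that takes precedence over this attribute. A minimal sketch of that fallback is shown here; LangHolder and resolve_lang are hypothetical names used for illustration and are not part of pydst.

```python
class LangHolder:
    """Hypothetical stand-in for a class with a default language attribute."""

    def __init__(self, lang="en"):
        self.lang = lang

    def resolve_lang(self, lang=None):
        # A per-call argument wins; otherwise fall back to the attribute.
        return self.lang if lang is None else lang


holder = LangHolder(lang="da")
default_lang = holder.resolve_lang()       # no argument: attribute is used
override_lang = holder.resolve_lang("en")  # argument given: it takes precedence
```

Using None as the sentinel keeps "not provided" distinct from any real language code.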
-
get_csv(path, table_id, variables=None, lang=None)
Save table_id as a CSV file.
Parameters:
- path (str) – Output directory.
- table_id (str) – Table ID of the table you want to retrieve data from.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
- variables (dict, optional) –
Returns: Nothing.
Return type: None
Todo
- Implement tests
- Ensure that variables (dict) can only take lists as inputs that are entirely filled with strings
- Ensure that variables (dict) can take string as values
-
get_data(table_id, variables=None, lang=None)
DataFrame with variables contained in table_id
Parameters:
- table_id (str) – Table ID of the table you want to retrieve data from.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
- variables (dict, optional) –
Returns: A DataFrame with data from table_id.
Return type: pandas.DataFrame
Todo
- Implement tests
- Ensure that variables (dict) can only take lists as inputs that are entirely filled with strings
- Ensure that variables (dict) can take string as values
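The Todo items above describe checks on variables that are not implemented yet. One possible shape for such a check is sketched below. validate_variables is a hypothetical helper, the use of TypeError is an assumption, and the keys and values in the usage line are made up for illustration.

```python
def validate_variables(variables):
    """Hypothetical check: each value must be a str or a list of str."""
    if variables is None:
        return None
    if not isinstance(variables, dict):
        raise TypeError("variables must be a dict or None")
    for key, value in variables.items():
        if isinstance(value, str):
            continue
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        # Assumption: the error type is not documented; TypeError is a guess.
        raise TypeError("variables[%r] must be a str or a list of str" % (key,))
    return None


# Illustrative input only; real variable names and codes depend on the table.
validate_variables({"time": ["2018Q1", "2018Q2"], "sex": "1"})
```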
-
get_metadata(table_id, lang=None)
Metadata about table_id
Parameters:
- table_id (str) – Table ID of the table you want to retrieve data from.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
Returns: A dictionary containing metadata about the specified table.
Return type: dict
Todo
- Implement tests
-
get_subjects(subjects=None, lang=None)
Retrieve subjects and sub-subjects from Statistics Denmark.
This function retrieves the subjects and sub-subjects that Statistics Denmark uses to categorize its tables. The subject IDs can then be passed to get_tables to retrieve only the tables classified under those subjects.
Parameters:
- subjects (str/list, optional) – If a valid subject ID is provided, the subject's sub-subjects are returned, if available. subjects can be either a list of subject IDs in string format or a comma-separated string.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
Returns: A DataFrame with subjects.
Return type: pandas.DataFrame
Examples
The example below shows how get_subjects is used.

    >>> from pydst import Dst
    >>> Dst().get_subjects()
        active                               desc  hasSubjects  id
    0     True           Population and elections         True  02
    1     True                  Living conditions         True  05
    2     True            Education and knowledge         True  03
    ..     ...                                ...          ...  ..
    10    True                   Business sectors         True  11
    11    True  Geography, environment and energy         True  01
    12    True                              Other         True  19

    [13 rows x 4 columns]
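Because subjects may be given either as a list of IDs or as a comma-separated string, both forms have to end up in one representation before a request is made. The sketch below shows one way to do that. subjects_to_comma_str is a hypothetical helper, not a function exposed by pydst, and the library may handle this differently.

```python
def subjects_to_comma_str(subjects):
    """Hypothetical helper: accept a list of IDs or a comma-separated string."""
    if subjects is None:
        return None
    if isinstance(subjects, str):
        # Split so that stray whitespace around each ID can be dropped.
        parts = [s.strip() for s in subjects.split(",")]
    else:
        parts = [str(s).strip() for s in subjects]
    # Skip empty entries such as those produced by a trailing comma.
    return ",".join(p for p in parts if p)


from_list = subjects_to_comma_str(["02", "05"])
from_string = subjects_to_comma_str("02, 05")
```

Both calls yield the same string, so the rest of the code only has to deal with one input shape.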
-
get_tables(subjects=None, inactive_tables=False, lang=None)
Parameters:
- inactive_tables (bool, optional) – If True, the DataFrame will contain tables that are no longer updated.
- subjects (str/list, optional) – If valid subject IDs are provided, only the tables classified under those subjects are returned. subjects can be either a list of subject IDs in string format or a comma-separated string.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
Returns: A DataFrame with tables.
Return type: pandas.DataFrame
Todo
- Check inactive_tables (cerberus validator)
Examples
The example below shows how get_tables is used.

    >>> from pydst import Dst
    >>> Dst().get_tables()
          active firstPeriod        id latestPeriod
    0       True      2008Q1    FOLK1A       2018Q2
    1       True      2008Q1    FOLK1B       2018Q2
    2       True      2008Q1    FOLK1C       2018Q2
    ...      ...         ...       ...          ...
    1958    True        2005  SKOVRG01         2016
    1959    True        2005  SKOVRG02         2016
    1960    True        2005  SKOVRG03         2016

                                                text      unit
    0     Population at the first day of the quarter    number
    1     Population at the first day of the quarter    number
    2     Population at the first day of the quarter    number
    ...                                          ...       ...
    1958            Growing stock (physical account)  1,000 m3
    1959            Growing stock (monetary account)  DKK mio.
    1960      Forest area (Kyoto) (physical account)       km2

                     updated                                          variables
    0    2018-05-08 08:00:00           [region, sex, age, marital status, time]
    1    2018-05-08 08:00:00              [region, sex, age, citizenship, time]
    2    2018-05-08 08:00:00  [region, sex, age, ancestry, country of origin...
    ...                  ...                                                ...
    1958 2017-11-28 08:00:00  [balance items, species of wood, county counci...
    1959 2017-11-28 08:00:00  [balance items, species of wood, county counci...
    1960 2017-11-28 08:00:00     [balance items, county council district, time]

    [1961 rows x 8 columns]
-
get_variables(table_id, lang=None)
DataFrame with variables contained in table_id
Parameters:
- table_id (str) – Table ID of the table you want to retrieve data from.
- lang (str, optional) – If lang is provided, it is used instead of the Dst class attribute lang. Can take the values en for English or da for Danish.
Returns: A DataFrame with variables.
Return type: pandas.DataFrame
Todo
- Implement tests
- TableID cerberus validator
-
pydst.utils module
-
pydst.utils.bad_request_wrapper(r)
Raises an error if the response indicates an HTTP error.
A wrapper around the HTTP error: if an error message is available from Statistics Denmark, that message is used because it is more descriptive.
Parameters: r (requests.models.Response) – Response from the requests library.
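The idea of preferring the server's own error message can be sketched without the requests library. The code below is an illustration under stated assumptions, not the pydst implementation: FakeResponse is a stand-in for requests.models.Response, raise_for_bad_request is a hypothetical name, the "message" key in the error body is an assumption, and RuntimeError is an arbitrary choice of exception.

```python
class FakeResponse:
    """Minimal stand-in for requests.models.Response, for illustration only."""

    def __init__(self, status_code, payload=None):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        if self._payload is None:
            raise ValueError("no JSON body")
        return self._payload


def raise_for_bad_request(r):
    """Raise an error, preferring the server's own message if one is present."""
    if r.status_code < 400:
        return None
    try:
        # Assumption: the error body is a JSON object with a "message" key.
        message = r.json().get("message")
    except (ValueError, AttributeError):
        message = None
    # Fall back to a generic text when no descriptive message is available.
    raise RuntimeError(message or "HTTP error %d" % r.status_code)
```

A successful response passes through silently, while a failed one surfaces the most descriptive text available.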
-
pydst.utils.check_lang(lang)
Returns lang if lang is an available language.
Parameters: lang (str) – Can take the values en for English or da for Danish.
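A minimal sketch of this behaviour follows. It is a standalone illustration, not the library's code: check_lang_sketch and VALID_LANGS are hypothetical names, and since the documentation does not say what happens for an unavailable language, raising ValueError is an assumption.

```python
VALID_LANGS = ("en", "da")  # the two values named in this documentation


def check_lang_sketch(lang):
    """Return lang unchanged if it is an available language."""
    if lang not in VALID_LANGS:
        # Assumption: the error type is not documented; ValueError is a guess.
        raise ValueError("lang must be one of %s, got %r" % (VALID_LANGS, lang))
    return lang
```

Returning the value makes the check usable inline, for example when assigning the language used for a request.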
-
pydst.utils.construct_url(base, version, app, path, query)
Todo
- Test that the resulting URL matches the expected URL
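The function is documented only by its signature, so the following is a guess at how five such parts could be combined, together with the kind of test the Todo asks for. construct_url_sketch is a hypothetical name. The layout base/version/app/path?query, the treatment of path as an optional string and of query as a dict are all assumptions, and the host is a placeholder.

```python
from urllib.parse import urlencode


def construct_url_sketch(base, version, app, path, query):
    """Hypothetical URL builder: base/version/app[/path][?query]."""
    parts = [base.rstrip("/"), version, app]
    if path:
        # Assumption: path is a single optional string segment.
        parts.append(str(path).strip("/"))
    url = "/".join(parts)
    if query:
        # Assumption: query is a dict of parameter names to values.
        url += "?" + urlencode(query)
    return url


# Placeholder host and made-up arguments, for illustration only.
url = construct_url_sketch("https://api.example.org/", "v1", "tables", None,
                           {"lang": "en"})
assert url == "https://api.example.org/v1/tables?lang=en"
```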
pydst.validators module
-
pydst.validators.dict_keys_to_comma_str(dict)
Joins dict keys into a comma-separated string.
Parameters: dict (dict) – A dictionary.
Returns: Comma-separated string of keys.
Return type: str
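The described behaviour amounts to a join over the dictionary's keys. The sketch below is a standalone illustration with a hypothetical name, not the library's source, and the keys in the usage line are made up.

```python
def dict_keys_to_comma_str_sketch(d):
    """Join the keys of d into one comma-separated string."""
    # str() guards against non-string keys; the values are ignored entirely.
    return ",".join(str(key) for key in d)


keys = dict_keys_to_comma_str_sketch({"region": ["000"], "time": ["2018Q1"]})
```

On Python 3.7 and later, dictionaries keep insertion order, so the keys appear in the order they were added.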
-
pydst.validators.lang_validator(lang, valid_langs)
Validates that the language is correctly specified.
This function validates that lang is contained in valid_langs, and that lang and valid_langs have the correct types.
Parameters:
- lang (str) – Language that is contained in valid_langs. Must consist of letters only.
- valid_langs (list of str) – A list of valid languages. Each element must consist of letters only.
Returns: None
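The three checks named above (types, letters only, membership) can be sketched as follows. lang_validator_sketch is a hypothetical standalone function, and because the documentation does not state which exceptions are raised, the use of TypeError and ValueError here is an assumption.

```python
def lang_validator_sketch(lang, valid_langs):
    """Validate lang against valid_langs; return None on success."""
    # Type checks first, so the later string methods are safe to call.
    if not isinstance(lang, str):
        raise TypeError("lang must be a str")
    if not isinstance(valid_langs, list) or not all(
        isinstance(v, str) for v in valid_langs
    ):
        raise TypeError("valid_langs must be a list of str")
    # Every language code must consist of letters only.
    if not lang.isalpha() or not all(v.isalpha() for v in valid_langs):
        raise ValueError("languages must consist of letters only")
    if lang not in valid_langs:
        raise ValueError("%r is not one of %r" % (lang, valid_langs))
    return None


lang_validator_sketch("da", ["en", "da"])  # passes silently
```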
Module contents
Top-level package for pydst.