Datasets

Explore and use Datadex datasets in your preferred tools!

πŸ” Explore

You can get a sense of what the produced datasets look like by exploring them on HuggingFace. With each commit to main, Datadex pushes a new version of the datasets as Parquet files.

These are the available Datasets!

🔧 Use

Since the datasets are just Parquet files somewhere, you can use pretty much any tool or framework to explore them. Let's look at the Spain IPC dataset with Polars.

import polars as pl

# Read the Parquet file straight from HuggingFace over HTTPS
df = pl.read_parquet(
    "https://huggingface.co/datasets/datonic/spain_ipc/resolve/refs%2Fconvert%2Fparquet/default/main/0000.parquet"
)
# Show four random rows
df.sample(4)
shape: (4, 6)
periodo     clases                            indice  variacion_mensual  variacion_anual  variacion_en_lo_que_va_de_ano
date        str                               f64     f64                f64              f64
2018-05-01  "1254 Seguros relacionados con …  96.323  0.0                1.5              0.0
2010-04-01  "0312 Prendas de vestir"          98.85   10.3               -1.2             -4.8
2014-08-01  "0952 Prensa"                     86.534  0.6                1.7              0.9
2013-08-01  "0942 Servicios culturales"       99.054  0.3                10.7             1.4
ipc_prensa = df.filter(pl.col("clases") == "0952 Prensa")
import altair as alt

# Interactive line chart of the index over time for the "0952 Prensa" class
alt.Chart(ipc_prensa).mark_line().encode(
    x=alt.X("periodo", title="Period", axis=alt.Axis(labelAngle=-45)),
    y=alt.Y("indice", title="Index"),
    tooltip=["periodo", "indice"],
).properties(width="container", height=400, title="IPC Prensa Over Time").interactive()
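
The Parquet URL used for the Spain IPC dataset has a regular shape, so it may be handy to build it from the dataset name. The sketch below is only an illustration: `parquet_url` is a hypothetical helper, and its defaults (`datonic`, `default`, `main`, `0000.parquet`) are copied from the single URL shown above. Whether other Datadex datasets follow the same layout is an assumption, not something this page confirms.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co/datasets"


def parquet_url(
    dataset: str,
    org: str = "datonic",
    config: str = "default",
    split: str = "main",
    shard: str = "0000.parquet",
) -> str:
    """Build a Parquet URL shaped like the Spain IPC example (hypothetical helper)."""
    # The ref contains slashes, which must be percent-encoded inside the path.
    ref = quote("refs/convert/parquet", safe="")
    return f"{HF_BASE}/{org}/{dataset}/resolve/{ref}/{config}/{split}/{shard}"


url = parquet_url("spain_ipc")
print(url)
```

The resulting string can be passed to `pl.read_parquet` in the same way as the literal URL in the walkthrough.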