Martín n
@martinszy
Hello, I'm testing datapackage-pipelines
Rufus Pollock
@rufuspollock
@martinszy hey there, that's great!
Martín n
@martinszy
I have a couple of questions:
1) I have changing filenames, is there any way to use wildcards in add_resource?
2) Is it a bad idea to have a dump.to_ckan action?
3) Does ckan handle datapackages yet?
@amercader and @brew can answer any questions you have about the CKAN integrations next week
Martín n
@martinszy
thanks, I'll check that out
Martín n
@martinszy
For 1) I'm thinking: either create a pre-process that generates the YAML file, or modify datapackage-pipelines to allow custom_resource_adders or something like that, which listen for multiple files and then trigger the rest of the process... but I've not analyzed this properly yet
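The first option (a pre-process that generates the YAML file) could look roughly like the sketch below. It is not part of datapackage-pipelines: `build_pipeline_spec`, `write_pipeline_spec`, the pipeline id `changing-files` and the `output` path are hypothetical, and the step names (`add_resource`, `stream_remote_resources`, `dump.to_path`) and their parameters are assumed to match the project README. The spec is written as JSON, which YAML parsers generally accept, so the sketch needs no YAML library.

```python
import json
from pathlib import Path


def build_pipeline_spec(data_dir, pattern="*.csv", pipeline_id="changing-files"):
    """Return a pipeline spec dict with one add_resource step per matching file."""
    steps = []
    # Expand the wildcard here, since the pipeline step itself takes a fixed url.
    for path in sorted(Path(data_dir).glob(pattern)):
        steps.append({
            "run": "add_resource",
            "parameters": {"name": path.stem.lower(), "url": str(path)},
        })
    # Step names below are assumptions taken from the project README.
    steps.append({"run": "stream_remote_resources"})
    steps.append({"run": "dump.to_path", "parameters": {"out-path": "output"}})
    return {pipeline_id: {"pipeline": steps}}


def write_pipeline_spec(spec, target="pipeline-spec.yaml"):
    """Write the spec as JSON text, which is intended to be read as YAML."""
    Path(target).write_text(json.dumps(spec, indent=2), encoding="utf-8")
```

Running `write_pipeline_spec(build_pipeline_spec("data"))` before each pipeline run would regenerate the spec from whatever files currently match the pattern.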
roll
@roll
@Stephen-Gates Could you please try again?
Stephen Gates
@Stephen-Gates
@roll working perfectly (see Job History). The PR okfn/licenses#57 is now good to go. Thanks so much for your help.
Brook Elgie
@brew
@martinszy Don't forget you can generate pipelines dynamically using a "Generator": https://github.com/frictionlessdata/datapackage-pipelines/#plugins-and-source-descriptors Perhaps this would help with the flexibility you're after?
Martín n
@martinszy
@brew awesome! I'll check it out!!
Oleg Lavrovsky
@loleg
Good morning & greetings from a park bench (47.37226921, 8.54668868) next to the office of data.stadt-zurich.ch, where I'm starting my new project with you today.
I'm starting with a review of the Python implementation, which I already started testing last week in combination with a tool we use to work with data providers & teams at hackathons. Thanks @callmealien @pwalsh @jobarratt for your help getting plugged in so far! I'll be keeping an open dev log, and you can mention me here anytime if you have questions or suggestions.
Vitor Baptista
@vitorbaptista
@loleg Good luck with the new project! Do you plan on publishing your open dev log somewhere?
Oleg Lavrovsky
@loleg
@vitorbaptista absolutely, it'll be on GitHub at least in raw form later today
Vitor Baptista
@vitorbaptista
@loleg Cool! Please let me know when it's online
Oleg Lavrovsky
@loleg
@vitorbaptista thanks for your enthusiasm :) https://github.com/loleg/devlog/tree/master/content
Vitor Baptista
@vitorbaptista
:tada: :smile:
Oleg Lavrovsky
@loleg
Are there notes anywhere of the roots of the standard, specifically how much it owes to (and potentially influences the future of) https://github.com/ckan/ckan/blob/master/ckan/logic/schema.py#L206 ?
Oleg Lavrovsky
@loleg
(or is this a touchy topic I should leave to later discussion..)
Stephen Gates
@Stephen-Gates

Hi, I'm looking for test datapackage.zip files. I came across https://github.com/frictionlessdata/testsuite-extended and https://github.com/frictionlessdata/example-data-packages but these didn't help. I couldn't find datapackage.zip files on datahub.io either.

Any suggestions on a source for data package test data and if not, where is the best place to contribute these?

Rufus Pollock
@rufuspollock

> Are there notes anywhere of the roots of the standard, specifically how much it owes to (and potentially influences the future of) https://github.com/ckan/ckan/blob/master/ckan/logic/schema.py#L206 ?

Not a touchy topic at all. If you go back to the pre-history of frictionless data in 2007 or so then yes: CKAN metadata was partially inspired by Python packaging and so was the first "dpm" (data package manager). In fact, as you may know, CKAN was originally intended to act like PyPI, CPAN, CRAN, etc.

Over time, the source of inspiration of data packages has shifted a bit towards more recent packaging systems like node + package.json (pypi was probably not the best initial inspiration).

This is something that would probably be worth a discuss.okfn.org question so we can write it up there for posterity :-)

Oleg Lavrovsky
@loleg
Got it, will do, thanks Rufus!
Meiran Zhiyenbayev
@Mikanebu

Core Data: Essential Datasets for Data Wranglers and Data Scientists

This post introduces you to Core Data, presents a couple of examples and shows you how you can access and use core data easily from your own tools and systems including R, Python, Pandas and more.
http://datahub.io/blog/core-data-essential-datasets-for-data-wranglers-and-data-scientists

This is a blog post. To read full text, please, follow the link above.

Jeremy Palmer
@palmerj
Hi All!
I was wondering what it would take to be involved in the next development of the specifications, in particular for the Tabular Data Package.
For the Data Service that we manage we have been looking for a standard way of better describing CSV metadata, and the package schema defined here looks great.
The main issue is adding full spatial data type support.
Jeremy Palmer
@palmerj
At the moment points and GeoJSON support is there in a basic format, but issues like spatial extent of dataset, spatial reference system and vector geometry type (e.g. polygon, point, linestring) need to be added to the schema to make the schema work properly in the geospatial world
Thanks!
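For reference, the basic spatial support mentioned above can be sketched as a Table Schema descriptor with a `geopoint` and a `geojson` field. The field names and the `parse_geopoint` helper are hypothetical, and the range check is an assumption of this sketch. The default `geopoint` format is a "lon, lat" string. Nothing in the descriptor states the spatial extent, the reference system or the geometry type, which is the gap described above.

```python
# Table Schema descriptor using the two spatial field types.
schema = {
    "fields": [
        {"name": "station", "type": "string"},
        {"name": "location", "type": "geopoint"},  # default format: "lon, lat"
        {"name": "boundary", "type": "geojson"},
    ]
}


def parse_geopoint(value):
    """Parse a default-format geopoint string ("lon, lat") into two floats."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'lon, lat', got {value!r}")
    lon, lat = float(parts[0]), float(parts[1])
    # Range check is an assumption of this sketch (plain degrees).
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError(f"coordinates out of range: {value!r}")
    return lon, lat
```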
Stephen Gates
@Stephen-Gates
Hi @palmerj, it's great that you're keen to contribute to the spatial aspects of Data Packages. I started contributing by commenting or raising issues on GitHub, and through discussions on the forum or here. I wrote this guide to try to progress spatial data in packages. You may be interested in frictionlessdata/specs#86
Stephen Gates
@Stephen-Gates
Given @palmerj's question about contributing, I notice that many frictionless data repositories don't have the recommended community files. In the yet-to-be-accepted PR okfn/licenses#57 I added a code of conduct, contributing, and other community files. Is it appropriate to add these to the repo or are there standard templates to apply to OK repositories?
Rufus Pollock
@rufuspollock

> Hi All!

Welcome!

> I was wondering what it would take to be involved in the next development of the specifications, in particular for the Tabular Data Package.

You've taken the first step! We welcome contributions and new curators of the specifications.

> At the moment points and GeoJSON support is there in a basic format, but issues like spatial extent of dataset, spatial reference system and vector geometry type (e.g. polygon, point, linestring) need to be added to the schema to make the schema work properly in the geospatial world

We'd really welcome your help here - be it on improving geo in the tabular spec or on the separate WIP geo spec.

Rufus - co-lead curator of the Frictionless Data Specs

@Stephen-Gates first a huge appreciation of your ongoing contributions here -- you are definitely a candidate for curator :-)

> Given @palmerj's question about contributing, I notice that many frictionless data repositories don't have the recommended community files. In the yet-to-be-accepted PR okfn/licenses#57 I added a code of conduct, contributing, and other community files. Is it appropriate to add these to the repo or are there standard templates to apply to OK repositories?

@Stephen-Gates I'd say generally yes. For all OKI stuff I'd recommend first suggesting it on okfn/chat or the forum. For frictionless, if you could do a draft (or pull out your existing one into a PR), we can review it.

Mamadou Diagne
@dofbi
@Mikanebu @rufuspollock we have an organisation with some data packages. I would like to publish data under the name of the organisation rather than my username, like this: https://datahub.io/organisation_name/dataset Thank you
Rufus Pollock
@rufuspollock
@genova can you re-ask this in gitter.im/datahubio/chat - as that is the primary chat channel for datahub questions :-)
Meiran Zhiyenbayev
@Mikanebu
@genova Great! At the moment organization accounts are provided manually. Let's have a chat or schedule a short call so I can assist you in creating the account.
Oleg Lavrovsky
@loleg
Latest devlog posted from the Julia project, comments welcome https://github.com/loleg/devlog/blob/master/content/2017-11-06-Community.md
Matthew Thompson
@cblop
Hi all, I'm reasonably new to this remote style of development so please bear with all of my questions!
I'm working on the Clojure implementations of tableschema and datapackage. I'm starting off with the code for the type casting in tableschema-clj (doing it with clojure.spec as I go). Should I be pushing to the repo every time I get a bunch of tests passing, or get most of it working offline first and then push it all in one huge update?
If I push the code bit-by-bit, then obviously you can look at it, but nobody will be able to actually use the code for a while
Arvi Leino
@arvileino
The link http://data.okfn.org/tools/create to the datapackage.json creator tool does not work.
Rufus Pollock
@rufuspollock

> The link http://data.okfn.org/tools/create to the datapackage.json creator tool does not work.

Thanks for spotting and we'll get a redirect implemented for that ...

roll
@roll
Hi @cblop, at this stage just do whatever works best for you
Arvi Leino
@arvileino
Is it OK to have CSV, ODS and XLSX in the same datapackage.json? https://dev.turku.fi/datapackages/tietojarjestelmaluettelo/datapackage.json
Stephen Gates
@Stephen-Gates
@arvileino given http://frictionlessdata.io/specs/data-resource/#optional-properties I don’t see why not. I’ve mixed csv and tsv in a single package. Out of interest, is it the same data in different formats or related data?
Arvi Leino
@arvileino
Same data in different formats. Thanks!
Rufus Pollock
@rufuspollock
@arvileino that's fine - as @Stephen-Gates says multiple data in different formats is no problem.
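As a rough illustration of the answer above, a descriptor that lists the same data in several formats could be built like this. It is a minimal sketch: `build_descriptor`, the package name `system-register` and the file base name `systems` are hypothetical, and it assumes only the `name`, `path` and `format` properties from the Data Resource spec, with each resource given a unique name.

```python
import json


def build_descriptor(package_name, base_name, formats):
    """Build a datapackage.json-style dict with one resource per file format."""
    resources = [
        {
            "name": f"{base_name}-{fmt}",   # resource names must be unique
            "path": f"{base_name}.{fmt}",
            "format": fmt,
        }
        for fmt in formats
    ]
    return {"name": package_name, "resources": resources}


descriptor = build_descriptor("system-register", "systems", ["csv", "ods", "xlsx"])
print(json.dumps(descriptor, indent=2))
```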