Vitor Baptista
@vitorbaptista
@loleg Good luck with the new project! Do you plan on publishing your open dev log somewhere?
Oleg Lavrovsky
@loleg
@vitorbaptista absolutely, it'll be on GitHub at least in raw form later today
Vitor Baptista
@vitorbaptista
@loleg Cool! Please let me know when it's online
Oleg Lavrovsky
@loleg
@vitorbaptista thanks for your enthusiasm :) https://github.com/loleg/devlog/tree/master/content
Vitor Baptista
@vitorbaptista
:tada: :smile:
Oleg Lavrovsky
@loleg
Are there notes anywhere of the roots of the standard, specifically how much it owes to (and potentially influences the future of) https://github.com/ckan/ckan/blob/master/ckan/logic/schema.py#L206 ?
Oleg Lavrovsky
@loleg
(or is this a touchy topic I should leave to later discussion..)
Stephen Gates
@Stephen-Gates

Hi, I'm looking for test datapackage.zip files. I came across https://github.com/frictionlessdata/testsuite-extended and https://github.com/frictionlessdata/example-data-packages but these didn't help. I couldn't find datapackage.zip files on datahub.io either.

Any suggestions on a source for data package test data and if not, where is the best place to contribute these?

Rufus Pollock
@rufuspollock

Are there notes anywhere of the roots of the standard, specifically how much it owes to (and potentially influences the future of) https://github.com/ckan/ckan/blob/master/ckan/logic/schema.py#L206 ?

Not a touchy topic at all. If you go back to the pre-history of Frictionless Data in 2007 or so, then yes: CKAN metadata was partially inspired by Python packaging, and so was the first "dpm" (data package manager). In fact, as you may know, CKAN was originally intended to act like PyPI, CPAN, CRAN, etc.

Over time, the source of inspiration for Data Packages has shifted a bit towards more recent packaging systems like Node's package.json (PyPI was probably not the best initial inspiration).

This is something that would probably be worth a discuss.okfn.org question so we can write it up there for posterity :-)

Oleg Lavrovsky
@loleg
Got it, will do, thanks Rufus!
Meiran Zhiyenbayev
@Mikanebu

Core Data: Essential Datasets for Data Wranglers and Data Scientists

This post introduces you to Core Data, presents a couple of examples, and shows how you can access and use core data easily from your own tools and systems, including R, Python, Pandas and more.
http://datahub.io/blog/core-data-essential-datasets-for-data-wranglers-and-data-scientists

This is a blog post. To read full text, please, follow the link above.
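As a quick illustration of the kind of access the post describes, here is a minimal sketch of loading one core dataset into Pandas. The dataset name and URL pattern are illustrative assumptions, not taken from the post itself.

```python
# Minimal sketch: load a (hypothetical) core dataset straight into a DataFrame.
# The dataset name and URL pattern below are assumptions for illustration only.
import pandas as pd

url = "https://datahub.io/core/country-list/r/data.csv"  # hypothetical resource URL
df = pd.read_csv(url)   # pandas reads CSV directly over HTTP
print(df.head())
```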

Jeremy Palmer
@palmerj
Hi All!
I was wondering what it would take to be involved in the next development of the specifications. In particular for the Tabular Data Package
For the Data Service that we manage, we have been looking for a standard way of better describing CSV metadata, and the package schema defined here looks great.
The main issue is adding full spatial data type support.
Jeremy Palmer
@palmerj
At the moment, points and GeoJSON support is there in a basic form, but things like the spatial extent of the dataset, the spatial reference system, and the vector geometry type (e.g. polygon, point, linestring) need to be added to the schema to make it work properly in the geospatial world.
Thanks!
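For readers following along, here is a minimal sketch of the existing spatial support Jeremy refers to: Table Schema already defines `geopoint` and `geojson` field types. The field names below are made up, and the comment marks the gap he describes (no standard properties yet for extent, reference system, or geometry type).

```python
# Sketch of a Table Schema using the existing spatial field types.
# Field names are illustrative only.
schema = {
    "fields": [
        {"name": "site_name", "type": "string"},
        {"name": "location", "type": "geopoint"},   # lon,lat point
        {"name": "boundary", "type": "geojson"},    # arbitrary GeoJSON geometry
        # There is currently no agreed property for e.g. the CRS/EPSG code,
        # dataset extent, or allowed geometry type - that is the gap discussed here.
    ]
}
```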
Stephen Gates
@Stephen-Gates
Hi @palmerj, it's great that you're keen to contribute to the spatial aspects of Data Packages. I started contributing by commenting on or raising issues on GitHub, and through discussions on the forum or here. I wrote this guide to try to progress spatial data in packages. You may be interested in frictionlessdata/specs#86
Stephen Gates
@Stephen-Gates
Given @palmerj's question about contributing, I notice that many Frictionless Data repositories don't have the recommended community files. In the yet-to-be-accepted PR okfn/licenses#57 I added a code of conduct, contributing guide, and other community files. Is it appropriate to add these to the repo, or are there standard templates to apply to OK repositories?
Rufus Pollock
@rufuspollock

Hi All!

Welcome!

I was wondering what it would take to be involved in the next development of the specifications. In particular for the Tabular Data Package

You've taken the first step! We welcome contributions and new curators of the specifications.

At the moment, points and GeoJSON support is there in a basic form, but things like the spatial extent of the dataset, the spatial reference system, and the vector geometry type (e.g. polygon, point, linestring) need to be added to the schema to make it work properly in the geospatial world.

We'd really welcome your help here - be it on improving geo in the tabular spec or on the separate WIP geo spec.

Rufus - co-lead curator of the Frictionless Data Specs

@Stephen-Gates first, a huge appreciation of your ongoing contributions here -- you are definitely a candidate for curator :-)

Given @palmerj's question about contributing, I notice that many Frictionless Data repositories don't have the recommended community files. In the yet-to-be-accepted PR okfn/licenses#57 I added a code of conduct, contributing guide, and other community files. Is it appropriate to add these to the repo, or are there standard templates to apply to OK repositories?

@Stephen-Gates I'd say generally yes. For all of the OKI stuff I'd recommend first suggesting it on okfn/chat or the forum. For Frictionless, if you could do a draft (or pull your existing one out into a PR), we can review.

Mamadou Diagne
@dofbi
@Mikanebu @rufuspollock we have an organisation with some data packages. I would like to publish data under the organisation's name rather than my username, like this: https://datahub.io/organisation_name/dataset. Thank you!
Rufus Pollock
@rufuspollock
@genova can you re-ask this in gitter.im/datahubio/chat - as that is the primary chat channel for datahub questions :-)
Meiran Zhiyenbayev
@Mikanebu
@genova Great! At the moment, organization accounts are provisioned manually. Let's have a chat or schedule a short call so I can assist you with creating an account.
Oleg Lavrovsky
@loleg
Latest devlog posted from the Julia project, comments welcome https://github.com/loleg/devlog/blob/master/content/2017-11-06-Community.md
Matthew Thompson
@cblop
Hi all, I'm reasonably new to this remote style of development so please bear with all of my questions!
I'm working on the Clojure implementations of tableschema and datapackage. I'm starting off with the code for the type casting in tableschema-clj (doing it with clojure.spec as I go). Should I be pushing to the repo every time I get a bunch of tests passing, or get most of it working offline first and then push it all in one huge update?
If I push the code bit-by-bit, then obviously you can look at it, but nobody will be able to actually use the code for a while
Arvi Leino
@arvileino
The link http://data.okfn.org/tools/create to the datapackage.json creator tool does not work.
Rufus Pollock
@rufuspollock

The link http://data.okfn.org/tools/create to the datapackage.json creator tool does not work.

Thanks for spotting that - we'll get a redirect implemented for it ...

roll
@roll
Hi @cblop, at this stage just do whatever works best for you.
Arvi Leino
@arvileino
Is it OK to have CSV, ODS and XLSX in the same datapackage.json? https://dev.turku.fi/datapackages/tietojarjestelmaluettelo/datapackage.json
Stephen Gates
@Stephen-Gates
@arvileino given http://frictionlessdata.io/specs/data-resource/#optional-properties I don't see why not. I've mixed CSV and TSV in a single package. Out of interest, is it the same data in different formats or related data?
Arvi Leino
@arvileino
Same data in different formats. Thanks!
Rufus Pollock
@rufuspollock
@arvileino that's fine - as @Stephen-Gates says, data in multiple formats is no problem.
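To make the answer concrete, here is a minimal sketch of a descriptor listing the same data as CSV, ODS and XLSX resources. The file names are made up; only the package name echoes the URL Arvi shared.

```python
# Sketch of one datapackage.json describing the same data in three formats,
# written out with the standard library. File names are illustrative only.
import json

descriptor = {
    "name": "tietojarjestelmaluettelo",
    "resources": [
        {"name": "data-csv",  "path": "data/data.csv",  "format": "csv"},
        {"name": "data-ods",  "path": "data/data.ods",  "format": "ods"},
        {"name": "data-xlsx", "path": "data/data.xlsx", "format": "xlsx"},
    ],
}

with open("datapackage.json", "w") as f:
    json.dump(descriptor, f, indent=2)
```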
Meiran Zhiyenbayev
@Mikanebu

Import online data files directly with scheduling

Users can now import online data files directly into the DataHub using the data command line tool – and set up scheduled re-imports at the same time.
https://datahub.io/blog/2017-11-14-import-online-data-files-directly-with-scheduling

This is a blog post. To read full text, please, follow the link above.

Kenji
@Kenji-K
Hi, I understand if you're too busy to respond to my earlier question, but I want to try again. Is there any resource on the relationship between the Frictionless Data specs and the Dublin Core Metadata Initiative? Am I correct in understanding that I can extend Frictionless Data with Dublin Core?
hgossler
@hgossler
Is there a certain way the frictionlessdata specs/project should be cited? Just the homepage, i.e. http://frictionlessdata.io/ ?
Paul Walsh
@pwalsh

@Kenji-K hi. There is some discussion of a namespacing pattern as a way to extend the specs with other metadata schemes, such as Dublin Core, but there is no specific resource on this yet:

frictionlessdata/specs#403

Hi @hgossler that sounds correct, or a specific spec URL if relevant.
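Purely as an illustration of the namespacing pattern discussed in frictionlessdata/specs#403, a descriptor might carry Dublin Core terms alongside the standard properties under a prefix. The `dc:`-prefixed keys below are hypothetical and not part of the spec.

```python
# Hypothetical sketch only: Dublin Core terms namespaced into a Data Package
# descriptor. The "dc:" keys are NOT defined by the spec; they just show the
# general extension pattern under discussion.
descriptor = {
    "name": "example-package",
    "title": "Example Package",
    "dc:creator": "Jane Doe",           # hypothetical namespaced property
    "dc:subject": "energy statistics",  # hypothetical namespaced property
    "resources": [{"name": "data", "path": "data.csv"}],
}
```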
Kenji
@Kenji-K
@pwalsh Excellent, thanks.
Paul Walsh
@pwalsh
@Kenji-K we invite contributions in this area, if it is of interest to you
Paul Walsh
@pwalsh

@serahrono @jobarratt

I suggest we close down the frictionlessdata/project repo. It is only used for the issue tracker at https://github.com/frictionlessdata/project/issues, but #359 should be on the specs issue tracker, #360 is a duplicate of #332 (which should be on the website issue tracker), and #358 should also be on the website issue tracker. @serahrono, can you do this as part of your work on the website, please?

Serah Njambi Rono
@serahrono
:+1:
Meiran Zhiyenbayev
@Mikanebu
Hey there! I am using tabulator-py for a remote source in XLS format; the source location is https://www.eia.gov/dnav/pet/hist_xls/RBRTEd.xls. Somehow it converts the date incorrectly, from 5/20/1987 to 31917.0. Is there anything I need to do to make it a valid date?
roll
@roll
@Mikanebu could you please create an issue? Or maybe even a PR? It's related to Excel dates - frictionlessdata/implementations#23
Meiran Zhiyenbayev
@Mikanebu
@roll ok
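In the meantime, a possible workaround sketch: 31917.0 looks like an Excel serial day number, which for dates after 28 Feb 1900 can be converted by counting days from 1899-12-30 (the offset absorbs Excel's phantom 29 Feb 1900). This is a general Excel-dates recipe using only the standard library, not tabulator-py's own handling.

```python
# Workaround sketch: convert an Excel serial day number to a date.
# Valid for serials after 28 Feb 1900; 31917.0 comes out as 1987-05-20.
from datetime import datetime, timedelta

def excel_serial_to_date(serial):
    return datetime(1899, 12, 30) + timedelta(days=float(serial))

print(excel_serial_to_date(31917.0).date())  # -> 1987-05-20
```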
Eoghan Ó Carragáin
@eocarragain
@pwalsh @Kenji-K how about JSON-LD for this?