These are chat archives for frictionlessdata/chat

5th
Jul 2016
Dean Rather
@deanrather
Jul 05 2016 01:45
Hi! I'm new here.
Dean Rather
@deanrather
Jul 05 2016 01:50
I'm designing a RESTful API to allow definition, creation, and editing of data packages. It follows http://jsonapi.org/ conventions at the moment, with columns, rows, and cells each being separate (related) resources. So you could PATCH /datapackage/1/records/2/cells/3 to modify the 3rd column of the 2nd row of a CSV. I was just wondering if there's any suggested way to structure such an API, and whether you'd consider it bad practice to give each cell a resource of its own... (a 100x100 CSV has 10k cell resources...)
I'm considering ditching all those relationships entirely, and just having a single resource type (datapackage) which has only two properties: data and schema. But then how would the API look to modify a single cell? You'd need to re-upload the entire CSV each time...
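One possible middle ground, sketched below in Python, is to keep the per-cell URL as an addressing scheme while storing the table as a single structure, so a 100x100 CSV does not need 10k stored resources. This is only an illustration: the route pattern is taken from the URL shape above, and `patch_cell`, `packages`, and the in-memory list-of-rows storage are hypothetical names, not part of any Frictionless Data library.

```python
import re

# Hypothetical route pattern, based on the URL shape mentioned above.
CELL_PATH = re.compile(r"^/datapackage/(\d+)/records/(\d+)/cells/(\d+)$")


def patch_cell(packages, path, value):
    """Apply a single-cell PATCH to a table held in memory.

    `packages` maps a package id to a list of rows (each row is a list of
    cells). Row and column numbers in the path are treated as 1-based.
    """
    match = CELL_PATH.match(path)
    if match is None:
        raise ValueError("unsupported path: " + path)
    package_id, row, col = (int(group) for group in match.groups())
    rows = packages[package_id]
    rows[row - 1][col - 1] = value  # only one cell changes, no re-upload
    return rows


# Package 1 holds a 2x3 table; modify the 3rd column of the 2nd row.
packages = {1: [["a", "b", "c"], ["d", "e", "f"]]}
patch_cell(packages, "/datapackage/1/records/2/cells/3", "F")
```

Here the cell is addressable over HTTP but is never persisted as a separate object, which avoids re-uploading the whole CSV for a single edit.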
Ivo Jimenez
@ivotron
Jul 05 2016 06:52
@ndkv follow-up question to my previous one. How do you deal with big datasets? Can I "install" a dataset without having to retrieve its data? i.e. I'd like to keep a reference to the dataset I'm using (the datapackage.json file) without downloading the data itself
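As a rough sketch of what keeping only the reference could look like, the Python below parses a datapackage.json descriptor and lists where each resource's data lives without fetching any of it. The descriptor contents, the example URL, and the `list_resources` helper are made up for illustration. The sketch assumes each resource locates its data through a `path` or `url` property, and it does not use any existing data package library.

```python
import json

# A made-up descriptor; a real one would be read from datapackage.json.
descriptor_text = """
{
  "name": "example-package",
  "resources": [
    {"name": "measurements", "url": "http://example.com/measurements.csv"},
    {"name": "stations", "path": "data/stations.csv"}
  ]
}
"""


def list_resources(descriptor):
    """Return (name, location) pairs without downloading any data."""
    return [
        (resource.get("name"), resource.get("url") or resource.get("path"))
        for resource in descriptor.get("resources", [])
    ]


descriptor = json.loads(descriptor_text)
for name, location in list_resources(descriptor):
    print(name, location)
```

Only the small descriptor is stored locally; the data behind each location can be fetched later, if at all.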
Adrià Mercader
@amercader
Jul 05 2016 09:19
@deanrather I haven't been directly involved in data packages for a while, so definitely wait for others' feedback. But in general my feeling is that REST APIs for tabular resources only work well for individual edits, e.g. an online editor where you manually edit the value of a cell. For bulk operations I think it makes more sense to have an endpoint that accepts lists of objects (against a known schema of course). I'm personally familiar with the CKAN DataStore API (http://docs.ckan.org/en/latest/maintaining/datastore.html#the-datastore-api), where you have datastore_create or datastore_upsert endpoints to which you can pass a list of records. I think that with datastore_upsert you can pass only the fields (cells) you want to update, so that would cover your use case of editing a single cell. Of course this is more cumbersome than a REST call, so it really depends on what your main use case is
As I said, I don't know if the existing Frictionless Data / Data Package libraries implement this sort of API
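To make the datastore_upsert idea concrete, here is a minimal Python sketch that only builds the request body for a single-cell edit and sends nothing over the network. `resource_id`, `method`, and `records` are parameters of CKAN's datastore_upsert action. The resource id, the `id` key field, the `price` column, and the `build_upsert_payload` helper are hypothetical. It also assumes the DataStore table has a unique key defined, and whether omitted fields are left untouched is the behaviour suggested above, not something verified here.

```python
import json


def build_upsert_payload(resource_id, key_field, key_value, changes):
    """Build a datastore_upsert body that touches one row.

    `key_field` is assumed to be the unique key of the DataStore table;
    `changes` holds only the fields (cells) to modify.
    """
    record = {key_field: key_value}
    record.update(changes)
    return {
        "resource_id": resource_id,
        "method": "upsert",
        "records": [record],
    }


# Edit a single cell: the "price" field of the row whose id is 2.
payload = build_upsert_payload("my-resource-id", "id", 2, {"price": 9.5})
body = json.dumps(payload)  # JSON body to POST to the datastore_upsert action
```

The same payload shape scales to bulk edits by adding more entries to `records`, which is the main advantage over one request per cell.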
Dean Rather
@deanrather
Jul 05 2016 23:12
@amercader Thanks for your response! I've been reading through the repos on GitHub and haven't seen any suggestion of how to "edit" datastores, certainly not a single cell... only how to create/read/validate them. I'm thinking I'll go down the route of having two representations of my data: 1) a read-only datastore mechanism, and 2) a read/write REST API.