babbage, which provides facts and aggregation endpoints over such DBs.
dump_to_sql that you can automatically reconstitute a relational database with all of the same constraints, types, and inter-table relationships that are specified within a data package? Is there any mechanism for specifying relationships between tables that originate from resources in different data packages?
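For context on the relationships being asked about: Table Schema lets a resource declare a `primaryKey` and `foreignKeys`, where each foreign key's `reference` names another resource. The sketch below is a minimal illustration with made-up package, resource, and field names; it only shows how such a relationship is written in a descriptor and read back, and it does not confirm whether dump_to_sql turns these into database constraints.

```python
# Hypothetical two-table data package descriptor.
# All names ("example-package", "stations", "casts", field names) are made up.
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "name": "stations",
            "path": "stations.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer", "constraints": {"required": True}},
                    {"name": "label", "type": "string"},
                ],
                "primaryKey": "id",
            },
        },
        {
            "name": "casts",
            "path": "casts.csv",
            "schema": {
                "fields": [
                    {"name": "cast", "type": "string"},
                    {"name": "station_id", "type": "integer"},
                ],
                # casts.station_id points at stations.id
                "foreignKeys": [
                    {
                        "fields": "station_id",
                        "reference": {"resource": "stations", "fields": "id"},
                    }
                ],
            },
        },
    ],
}


def list_relationships(package):
    """Return (table, column, referenced_table, referenced_column) tuples."""
    found = []
    for resource in package["resources"]:
        for fk in resource["schema"].get("foreignKeys", []):
            ref = fk["reference"]
            found.append((resource["name"], fk["fields"], ref["resource"], ref["fields"]))
    return found


print(list_relationships(descriptor))
# prints [('casts', 'station_id', 'stations', 'id')]
```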
@jhofra11_twitter the best way to get an overview of a publisher's datasets is to go to the publisher page. Use your username on datahub: https://datahub.io/username. On that page you can see all published datasets (only the publisher can see unlisted/private datasets).
The number of datasets you can publish is unlimited, but the total size of your datasets is limited (e.g., 5GB for the basic plan).
I'm having trouble turning off type inference on load in dataflows when I load a file directly, without a datapackage. Can someone help me out?
From the documentation, the validate option should control that, but it isn't working, unless I'm misunderstanding.
Using dataflows Version: 0.0.32
data.csv
cruise,station,date,time,lat,lon,cast,pump_serial_num
FK160115,4,1/19/2016,20:00,10,204,MP01,12665-01
FK160115,4,1/19/2016,20:00,10,204,MP01,12665-02
FK160115,4,1/19/2016,20:00,10,204,MP01,ML12371-01
FK160115,4,1/19/2016,20:00,10,204,MP01,ML 10820-02
FK160115,4,1/19/2016,20:00,10,204,MP01,ML 11000-01
FK160115,4,1/19/2016,20:00,10,204,MP01,ML 11515-02
FK160115,4,1/19/2016,20:00,10,204,MP01,ML 11934-02
FK160115,5,1/20/2016,18:00,8,156,MP02,12665-01
FK160115,5,1/20/2016,18:00,8,156,MP02,12665-02
FK160115,5,1/20/2016,18:00,8,156,MP02,ML 10820-02
FK160115,5,1/20/2016,18:00,8,156,MP02,ML 11515-02
FK160115,5,1/20/2016,18:00,8,156,MP02,ML 11000-01
FK160115,5,1/20/2016,18:00,8,156,MP02,ML 11491-02
from dataflows import Flow, load, printer

def flow():
    flow = Flow(
        load('data.csv', format='csv', validate=False, force_strings=True),
        printer(num_rows=1),
    )
    flow.process()

if __name__ == '__main__':
    flow()
getting:
data:
#    cruise      station      date        time        lat         lon         cast        pump_serial_num
     (string)    (integer)    (string)    (string)    (number)    (number)    (string)    (string)
---  ----------  -----------  ----------  ----------  ----------  ----------  ----------  -----------------
1    FK160115    4            1/19/2016   20:00       10          204         MP01        12665-01
2    FK160115    4            1/19/2016   20:00       10          204         MP01        12665-02
...
66   FK160115    14           2/4/2016    13:00       -4.23       142.23      MP14        ML12371-01
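As a baseline for what "no type inference" should look like with this file, here is a minimal sketch using only Python's standard `csv` module rather than dataflows. It is not a fix for the `load` call above, just a way to check the expected result: every column, including station, lat, and lon, comes back as a string. The sample rows are the first two from data.csv, inlined so the snippet runs on its own; `read_rows_as_strings` is a made-up helper name.

```python
import csv
import io

# First rows of the data.csv sample quoted above, inlined for a self-contained run.
SAMPLE = """cruise,station,date,time,lat,lon,cast,pump_serial_num
FK160115,4,1/19/2016,20:00,10,204,MP01,12665-01
FK160115,4,1/19/2016,20:00,10,204,MP01,12665-02
"""


def read_rows_as_strings(text):
    """Parse CSV text with no type inference: csv.DictReader yields every value as str."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows_as_strings(SAMPLE)

# Map each column name to the Python type name of its first value.
value_types = {name: type(value).__name__ for name, value in rows[0].items()}
print(value_types)
```

Comparing this against the printer output above makes the mismatch concrete: station is reported as integer and lat/lon as number there, while the baseline keeps them as `'4'`, `'10'`, and `'204'`.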