@vaibhavgeek can you give a bit more detail on the issue with file rename?
To upload a file, just follow the instructions here: https://datahub.io/docs/getting-started/publishing-data
Hi there. Just been testing out Google's new Dataset Search and found some spam datasets uploaded to the old datahub.io around 2013.
Where could/should I raise an issue to look at removing spam? Thanks
See screenshot above and visit the page:
Do folks have a favorite easy-to-use package for visualizing and filtering data that's accessible via data packages? Something that a relative layperson could use?
The perfect thing would be something that already ingests tabular data but is made Data Package aware. Right now you can fall back to anything that can ingest CSV (which is pretty much all tools). I can suggest some tools for playing with data that would suit (and we could think about how to plug in Data Package support, as we have with e.g. pandas).
Is there a recommended maximum file size for use with tabular data resources? When running
No, there is no limit for tabular data packages. This is a bug with data validate - can you open an issue on https://github.com/datahq/data-cli ?
I think you can use either route and for bigger packages goodtables may be better (and is used internally).
My other question here is whether any of the files can be chunked/partitioned - frictionlessdata/specs#620
I wanted to update our datasets on datahub.io/johnsnowlabs
When pushing the dataset this is what I got:
> Error! Max storage for user exceeded plan limit (5000MB)
However the total size of the data that has been uploaded is ~200MB
At the moment I am scraping the list of pages, but it would be great to somehow get an exhaustive list of what has been uploaded. My use case is that we have a list of 217 datasets that we want uploaded. However, only 197 were uploaded. How do we identify the ones that weren't processed?
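One way to narrow this down, as a minimal sketch: compare the names you intended to upload against the names that actually show up on the site. Both lists below are hypothetical placeholders, and how you obtain the uploaded list (scraping or otherwise) is left open.

```python
def find_missing(expected, uploaded):
    """Return the expected dataset names that are absent from uploaded, in order."""
    # Normalise case and whitespace so trivial differences do not count as misses.
    uploaded_set = {name.strip().lower() for name in uploaded}
    return [name for name in expected if name.strip().lower() not in uploaded_set]


# Hypothetical example data, not the real lists.
expected = ["chicago-taxi-trips", "cta-ridership-bus-routes", "nj-traffic-counts-data"]
uploaded = ["CTA-Ridership-Bus-Routes "]

missing = find_missing(expected, uploaded)
print(missing)
```

With the real 217-name list as `expected` and the scraped names as `uploaded`, the result should be the names still to investigate.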
I went through the data utility logs which seemed to have uploaded everything.
Auth-Token=your_jwt or as query parameter
&jwt=your_jwt to list the unlisted/private datasets as well
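A minimal sketch of the two options mentioned here: passing the JWT as an Auth-Token header, or as a jwt query parameter. The endpoint URL is a hypothetical placeholder (the actual listing endpoint is not given in this thread), and nothing is sent over the network; the code only builds the request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

# Hypothetical placeholder; substitute the real listing endpoint.
LIST_URL = "https://example.org/search?q=foo"


def with_jwt_query(url, jwt):
    """Return url with a jwt=<token> query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("jwt", jwt))
    return urlunsplit(parts._replace(query=urlencode(query)))


def with_auth_header(url, jwt):
    """Return a Request object carrying the token in an Auth-Token header."""
    return Request(url, headers={"Auth-Token": jwt})


print(with_jwt_query(LIST_URL, "your_jwt"))
req = with_auth_header(LIST_URL, "your_jwt")
print(req.header_items())
```

The header form keeps the token out of URLs that may end up in logs or browser history; the query form is easier to paste into a browser.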
| dataset | url | AVAILABLE |
|---+---+---|
| nasa-temperature-anomalies-by-latitude-bands-time-series-1880-2017 | https://datahub.io/JohnSnowLabs/nasa-temperature-anomalies-by-latitude-bands-time-series-1880-2017/v/2 | No |
| chicago-annual-taxpayer-location-list | https://datahub.io/JohnSnowLabs/chicago-annual-taxpayer-location-list/v/2 | No |
| nasa-global-temperature-anomalies-time-series-1880-2018 | https://datahub.io/JohnSnowLabs/nasa-global-temperature-anomalies-time-series-1880-2018/v/2 | No |
| nj-residents-leading-causes-of-death | https://datahub.io/JohnSnowLabs/nj-residents-leading-causes-of-death/v/2 | No |
| uk-properties-for-sale-by-ministry-of-defense | https://datahub.io/JohnSnowLabs/uk-properties-for-sale-by-ministry-of-defense/v/2 | No |
| tree-debris-requested-by-311-service | https://datahub.io/JohnSnowLabs/tree-debris-requested-by-311-service/v/2 | No |
| tree-trims-requested-by-311-service | https://datahub.io/JohnSnowLabs/tree-trims-requested-by-311-service/v/2 | No |
| garbage-carts-requested-by-311-service | https://datahub.io/JohnSnowLabs/garbage-carts-requested-by-311-service/v/2 | No |
| pot-holes-reported-by-311-service | https://datahub.io/JohnSnowLabs/pot-holes-reported-by-311-service/v/2 | No |
| eicu-collaborative-research-admissions-summary-statistics | https://datahub.io/JohnSnowLabs/eicu-collaborative-research-admissions-summary-statistics/v/1 | Yes |
| chicago-taxi-trips | https://datahub.io/JohnSnowLabs/chicago-taxi-trips/v/2 | No |
| chicago-beach-weather-stations-automated-sensors | https://datahub.io/JohnSnowLabs/chicago-beach-weather-stations-automated-sensors/v/2 | No |
| chicago-beach-water-quality-automated-sensors-report | https://datahub.io/JohnSnowLabs/chicago-beach-water-quality-automated-sensors-report/v/2 | No |
| all-countries-latitude-longitude | https://datahub.io/JohnSnowLabs/all-countries-latitude-longitude/v/4 | No |
| estimates-emissions-of-co2-at-country-and-global-level | https://datahub.io/JohnSnowLabs/estimates-emissions-of-co2-at-country-and-global-level/v/2 | No |
| energy-consumption-by-mode-of-transportation-and-type-of-energy | https://datahub.io/JohnSnowLabs/energy-consumption-by-mode-of-transportation-and-type-of-energy/v/2 | No |
| relocated-vehicles-in-chicago-last-90-days | https://datahub.io/JohnSnowLabs/relocated-vehicles-in-chicago-last-90-days/v/1 | No |
| nys-english-and-mathematics-exam | https://datahub.io/JohnSnowLabs/nys-english-and-mathematics-exam/v/2 | No |
| schools-for-life-safety-evaluations | https://datahub.io/JohnSnowLabs/schools-for-life-safety-evaluations/v/2 | No |
| food-affordability-for-households-led-by-females | https://datahub.io/JohnSnowLabs/food-affordability-for-households-led-by-females/v/2 | No |
| chicago-business-licenses | https://datahub.io/JohnSnowLabs/chicago-business-licenses/v/1 | No |
| city-population-annual-time-series | https://datahub.io/JohnSnowLabs/city-population-annual-time-series/v/3 | No |
| bloomington-animal-care-and-control-adopted-animals | https://datahub.io/JohnSnowLabs/bloomington-animal-care-and-control-adopted-animals/v/2 | No |
| legally-operating-businesses | https://datahub.io/JohnSnowLabs/legally-operating-businesses/v/2 | No |
| cta-ridership-bus-routes | https://datahub.io/JohnSnowLabs/cta-ridership-bus-routes/v/1 | Yes |
| most-popular-baby-names-by-gender-and-mother-ethnic-group | https://datahub.io/JohnSnowLabs/most-popular-baby-names-by-gender-and-mother-ethnic-group/v/2 | No |
| eicu-collaborative-research-available-tables-and-data | https://datahub.io/JohnSnowLabs/eicu-collaborative-research-available-tables-and-data/v/1 | Yes |
| nj-traffic-counts-data | https://datahub.io/JohnSnowLabs/nj-traffic-counts-data/v/2 | No |
| austin-adult-and-children-vaccinations | https://datahub.io/JohnSnowLabs/austin-adult-and-children-vaccinations/v/2 | No |
| euro-4-cars-emissions-traded-on-uk-market-2000-2012 | https://datahub.io/JohnSnowLabs/euro-4-cars-emissions-traded-on-uk-market-2000-2012/v/2 | No |
| lobbyist-agency-report | https://datahub.io/JohnSnowLabs/lobbyist-agency-report/v/2 | No |
| windsor-transit-bus-stops | https://datahub.io/JohnSnowLabs/windsor-transit-bus-stops/v/2 | No |
| omha-receipts-for-fiscal-year-2011-2013 | https://datahub.io/JohnSnowLabs/omha-receipts-for-fiscal-year-2011-2013/v/2 | No |
| impaired-driving-death-rate-by-age-and-race | https://datahub.io/JohnSnowLabs/impaired-driving-death-rate-by-age-and-race/v/2 | No |
| chicago-red-light-and-speed-camera-violations | https://datahub.io/JohnSnowLabs/chicago-red-light-and-speed-camera-violations/v/2 | No |
| us-employment-and-unemployment-rates | https://datahub.io/JohnSnowLabs/us-employment-and-unemployment-rates/v/2 | No |
| chicago-affordable-rental-housing-developments | https://datahub.io/JohnSnowLabs/chicago-affordable-rental-housing-developments/v/2 | No |
| vehicle-occupant-safety-data | https://datahub.io/JohnSnowLabs/vehicle-occupant-safety-data/v/2 | No |
| chicago-traffic-tracker | https://datahub.io/JohnSnowLabs/chicago-traffic-tracker/v/2 | No |
| imf-world-economic-outlook-database | https://datahub.io/JohnSnowLabs/imf-world-economic-outlook-database/v/2 | No |
| chicago-bike-racks-map | https://datahub.io/JohnSnowLabs/chicago-bike-racks-map/v/2 | No |
| us-states-and-territories | https://datahub.io/JohnSnowLabs/us-states-and-territories/v/2 | No |
| chicago-alternative-fuel-locations | https://datahub.io/JohnSnowLabs/chicago-alternative-fuel-locations/v/2 | No |
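For anyone wanting to pull the unavailable entries out of a pipe-delimited listing like this one, a rough sketch follows. It assumes one row per line with exactly three cells (dataset, url, AVAILABLE); the two sample rows are copied from the listing, and the function name is just illustrative.

```python
def parse_rows(table_text):
    """Parse pipe-delimited rows into (dataset, url, available) tuples."""
    rows = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip separator lines, blank lines and the header row.
        if len(cells) != 3 or cells[0] == "dataset":
            continue
        rows.append(tuple(cells))
    return rows


sample = """\
| dataset | url | AVAILABLE |
|---+---+---|
| chicago-taxi-trips | https://datahub.io/JohnSnowLabs/chicago-taxi-trips/v/2 | No |
| cta-ridership-bus-routes | https://datahub.io/JohnSnowLabs/cta-ridership-bus-routes/v/1 | Yes |
"""

unavailable = [name for name, url, available in parse_rows(sample) if available == "No"]
print(unavailable)
```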
Folks, I just spent a couple of hours uploading 43 datasets. It was very frustrating to find that only 3 of those datasets made it to the datahub website, even though the data utility uploaded everything without an issue. Here are the results:
@MAliNaqvi Hi Ali! As far as I can see, all datasets were uploaded successfully; however, most of them have validation/processing issues. You need to be logged in to see those errors. I know that you're using an org account, so the best way to check would be to pass your JWT within query params, e.g., try this
https://datahub.io/JohnSnowLabs/chicago-traffic-tracker/v/2?jwt=<your-jwt> so that you are able to see the FAILED dataset page.
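If there are many datasets to check, the same URL can be generated for each one. This is only a sketch: it follows the `owner/name/v/version?jwt=...` pattern of the link above, the helper name is made up for illustration, and `TOKEN` stands in for your real JWT.

```python
from urllib.parse import quote, urlencode

BASE = "https://datahub.io"


def dataset_page_url(owner, name, version, jwt):
    """Build the versioned dataset page URL with the JWT as a query parameter."""
    path = "/".join(quote(str(part), safe="") for part in (owner, name, "v", version))
    return f"{BASE}/{path}?{urlencode({'jwt': jwt})}"


# Names and versions taken from the listing in this thread; TOKEN is a placeholder.
to_check = [("chicago-traffic-tracker", 2), ("chicago-business-licenses", 1)]
for name, version in to_check:
    print(dataset_page_url("JohnSnowLabs", name, version, "TOKEN"))
```

Opening each printed URL while the token is valid should show the dataset page, including the failure details described above.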