Our schema allows extra headers and missing non-required headers, so the CSV could have fewer (or more) columns than the defined schema.
Traceback (most recent call last):
  File "test.py", line 8, in <module>
    report = validate("datapackage.json", checks=["structure", "schema", "foreign-key"], order_fields=True, infer_fields=False)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/goodtables/validate.py", line 80, in validate
    report = inspector.inspect(source, **options)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/goodtables/inspector.py", line 82, in inspect
    table_warnings, table_report = task.get()
  File "/home/didiez/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/didiez/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/goodtables/inspector.py", line 200, in __inspect_table
    success = prepare_func(stream, schema, extra)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/goodtables/contrib/checks/foreign_key.py", line 48, in prepare
    current_resource_name=extra['resource-name'])
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/goodtables/contrib/checks/foreign_key.py", line 116, in _get_relations
    relations[resource_name] = resource.read(keyed=True)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/datapackage/resource.py", line 377, in read
    foreign_keys_values=foreign_keys_values, **options)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/tableschema/table.py", line 353, in read
    for count, row in enumerate(rows, start=1):
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/tableschema/table.py", line 215, in iter
    for row_number, headers, row in iterator:
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/tableschema/table.py", line 509, in builtin_processor
    row, row_number=row_number, exc_handler=exc_handler)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/tableschema/schema.py", line 266, in cast_row
    error_data=keyed_row)
  File "/home/didiez/anaconda3/lib/python3.7/site-packages/tableschema/helpers.py", line 90, in default_exc_handler
    raise exc
datapackage.exceptions.CastError: Row length 5 doesn't match fields count 6 for row "2"
If we only check 'structure' and 'schema', everything works as expected.
I could not find any reference in the docs about reading (and casting) CSV files with fields not defined in the schema, or with missing fields (non-required columns in the schema).
Should I open an issue, or is this a known limitation when reading CSV files with goodtables?
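Until this is clarified upstream, one workaround is to align each CSV row to the schema's field list before handing the data to the foreign-key check, so every row has exactly the schema's column count. This is a stdlib-only sketch; the `SCHEMA_FIELDS` names and the `align_rows` helper are hypothetical, not part of the goodtables or tableschema APIs:

```python
import csv
import io

# Hypothetical schema with 6 fields (the CSV below is missing 'phone').
SCHEMA_FIELDS = ["id", "name", "age", "city", "email", "phone"]

def align_rows(csv_text, schema_fields):
    """Reorder and pad CSV rows to match the schema's field list.

    Missing (non-required) columns become None; extra columns are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for record in reader:
        yield [record.get(field) for field in schema_fields]

csv_data = "id,name,age,city,email\n1,Ana,30,Lisbon,ana@example.com\n"
rows = list(align_rows(csv_data, SCHEMA_FIELDS))
# Every row now has 6 values, so a length-based cast cannot fail.
```

With the rows padded this way, the "Row length 5 doesn't match fields count 6" cast error above would not be triggered.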
"No such file or directory: 'inline'" was masking the actual error. To read from a file stream it must be binary, you must specify the format, and it cannot live in a tmp path. Unfortunately the logic behind
return urlparse(source).scheme == '' and not os.path.isfile(source) masks the true error.
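The masking can be shown with a short stdlib-only sketch of that check (the wrapper name `looks_inline` is mine, not the library's): any local path that does not exist, for example one with a typo, has no URL scheme and fails `os.path.isfile`, so it is silently classified as inline data instead of producing a clear "file not found" for the real path.

```python
import os
from urllib.parse import urlparse

def looks_inline(source):
    # The check in question: no URL scheme and not an existing file
    # means the source gets treated as "inline" data.
    return urlparse(source).scheme == '' and not os.path.isfile(source)

# A missing or misspelled local path is misclassified as inline,
# so the eventual failure mentions 'inline' instead of the real path.
print(looks_inline("data/missing-file.csv"))       # True (assuming it doesn't exist)
print(looks_inline("https://example.com/data.csv"))  # False: has a scheme
```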
(node:66) UnhandledPromiseRejectionWarning: Error: Can't create a job on API. Reason: "Error: Request failed with status code 403"
goodtables-js is a wrapper around goodtables.io, so something may be wrong with the API endpoint.
I think I found a bug in the DataPackage Viewer (https://data.okfn.org/tools/view), but I'm not sure where to report it. Which GitHub repo should I open an issue in?
While the specification allows a resource path to be either a file on local disk or a remote resource (and most implementations seem to treat a path as remote if it starts with 'http'), the DataPackage Viewer on the website doesn't do this check and assumes the path is always a local file.
Here's an example where it fails: https://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fcovid-taskforce-cplp%2Fdados-v1. The download path for 'covid-casos-brasil' should be 'https://brasil.io/dataset/covid19/caso?format=csv', but the download button links to https://raw.github.com/covid-taskforce-cplp/dados-v1/master/https://brasil.io/dataset/covid19/caso?format=csv
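The fix the implementations above apply can be sketched in a few lines: use a remote path as-is, and only join local paths onto the package base. This is an illustrative stdlib-only sketch (the `resolve` function and its behavior are my reading of the spec, not the Viewer's actual code):

```python
from urllib.parse import urlparse

# Base URL a viewer would derive from the package location (example value).
BASE = "https://raw.github.com/covid-taskforce-cplp/dados-v1/master/"

def resolve(path, base=BASE):
    """Return a remote path unchanged; join local paths onto the base."""
    if urlparse(path).scheme in ("http", "https"):
        return path
    return base + path

print(resolve("data/casos.csv"))
print(resolve("https://brasil.io/dataset/covid19/caso?format=csv"))
```

With this check in place, the remote 'covid-casos-brasil' URL would no longer be prefixed with the GitHub raw base.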
Is there a preferred copyright notice for FD contributions? I’ve seen it done differently across the various DataPackage/TableSchema implementations.
Can you open an issue in the forum (https://github.com/frictionlessdata/forum/issues)? We can answer there.
Hi everyone, I'm happy to share that I recently joined the Frictionless Data team as a Developer Evangelist. My role involves spreading the word about Frictionless Data and encouraging community involvement. I'm always open to help and have discussions about Frictionless Data. https://www.datopian.com/blog/2020/03/20/joining-the-frictionless-data-team/
@lauragift21 a big welcome!
Hi, I'm excited to share that we'll be hosting a Frictionless Data virtual hangout on 20th April at 5 PM (CET). This is a great opportunity to learn what's been going on in the Frictionless Data community. Can't wait to see you there! :slight_smile:
Meeting ID: 871 8330 1345