These are chat archives for frictionlessdata/chat

8th
Aug 2018
roll
@roll
Aug 08 2018 08:47
@vincentchevrier Hi, that's an interesting question. For now, I think the way to do it is to save the data to disk manually and then recreate the data package. But it could be a good candidate for a library feature (in-memory -> files).
roll
@roll
Aug 08 2018 09:02
from datapackage import Package
from tabulator import Stream

DESCRIPTOR = {
  'resources': [
    {
      'name': 'cities',
      'profile': 'tabular-data-resource',
      'data': [
        ['name', 'country'],
        ['London', 'England'],
        ['Madrid', 'Spain'],
      ],
      'schema': {
        'fields': [
          {'name': 'name', 'type': 'string'},
          {'name': 'country', 'type': 'string'},
        ],
      },
    },
  ],
}

package = Package(DESCRIPTOR)
resource = package.get_resource('cities')
with Stream(resource.iter, headers=resource.schema.headers) as stream:
    stream.save('cities.csv')
package.descriptor['resources'][0]['path'] = 'cities.csv'
del package.descriptor['resources'][0]['data']
package.commit()
package.save('datapackage.json')
I think the ability to call resource.save('file.csv') (save a resource as CSV) is a very good candidate for a PR. The functionality is missing and it would be very easy to add.
vincentchevrier
@vincentchevrier
Aug 08 2018 14:17
Thanks @roll for the workaround. resource.save would be great! If you do add resource.save, triggering it in the package.save function with a keyword argument would be really nice. Thanks!
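To make the keyword-argument idea concrete, here is a rough sketch using only the standard library. Everything in it is hypothetical: save_package_with_data and inline_to_csv are made-up names, not part of datapackage, and it assumes inline data is a list of row lists with the header as the first row.

```python
import copy
import csv
import json
import os


def save_package_with_data(descriptor, target_dir, inline_to_csv=True):
    """Hypothetical helper: write a descriptor to target_dir/datapackage.json.

    When inline_to_csv is true, every resource whose 'data' is a list of rows
    is written to '<name>.csv' and its 'data' key is replaced by 'path'.
    """
    result = copy.deepcopy(descriptor)  # leave the caller's descriptor untouched
    os.makedirs(target_dir, exist_ok=True)

    for resource in result.get('resources', []):
        rows = resource.get('data')
        if not inline_to_csv or not isinstance(rows, list):
            continue  # nothing inline to write for this resource
        filename = resource['name'] + '.csv'
        csv_path = os.path.join(target_dir, filename)
        with open(csv_path, 'w', newline='', encoding='utf-8') as f:
            csv.writer(f).writerows(rows)  # assumes rows are lists, header first
        resource['path'] = filename
        del resource['data']

    json_path = os.path.join(target_dir, 'datapackage.json')
    with open(json_path, 'w', encoding='utf-8') as f:
        json.dump(result, f, indent=2)
    return result
```

Calling save_package_with_data(DESCRIPTOR, 'out') would then leave out/cities.csv next to out/datapackage.json, with the resource pointing at the file instead of carrying the rows inline.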
vincentchevrier
@vincentchevrier
Aug 08 2018 15:13

Another question on this approach and on formats in general. In this minimal example:

from datapackage import Package

DESCRIPTOR = {
  'resources': [
    {
      'name': 'teams',
      'data': [
        ['id'],
        [1.0],
        [2.0]
      ],
      'schema': {
        'fields': [
          {'name': 'id', 'type': 'integer'}
        ]
      }
    }
  ]
}

# Check
package = Package(DESCRIPTOR)
teams = package.get_resource('teams')
teams.read()

The teams.read() call raises a CastError triggered by the 1.0 float. Coming from a Python background, where ints play well with floats, this is a bit unexpected. If the value is represented as 1 or as '1', there are no CastErrors. I was just surprised that, in casting to integer, a string would be accepted but a float rejected. Is this the intended behavior?
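
In the meantime, one possible workaround is to normalize the inline rows before building the Package, so that whole-number floats such as 1.0 become ints. This is only a sketch: integral_floats_to_ints is a made-up helper, not something the library provides, and it converts every column, so it would need narrowing if some columns should stay floats.

```python
def integral_floats_to_ints(rows):
    """Return a copy of rows with whole-number floats (1.0) replaced by ints (1)."""
    def convert(value):
        # float.is_integer() is False for nan and inf, so those pass through as-is
        if isinstance(value, float) and value.is_integer():
            return int(value)
        return value

    return [[convert(value) for value in row] for row in rows]


rows = [['id'], [1.0], [2.0]]
cleaned = integral_floats_to_ints(rows)
print(cleaned)  # [['id'], [1], [2]]
```

The cleaned rows can then be used as the resource's 'data' in the descriptor in place of the original ones.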