    Joris Van den Bossche
    @jorisvandenbossche
    Will try to comment on the issue soon!
    Nick
    @NapsterInBlue

    Hi there!

    Was wondering how the convex_hull chart was generated on this page of the docs. I understand the .plot() method after calling .convex_hull on the GeoSeries, but can't seem to replicate the coloring and am not seeing any obvious methods to do so in the API.

    Thanks

    Martin Fleischmann
    @martinfleis
    I guess it was generated quite a while ago when geopandas and matplotlib had different defaults.
    Joris Van den Bossche
    @jorisvandenbossche
    Yes, Martin is correct.
    You can more or less replicate it with:
    df = geopandas.read_file(geopandas.datasets.get_path('nybb'))  
    
    df.convex_hull.plot(cmap='Set1', edgecolor='k', alpha=0.5)
    but we should update those docs to be actual running code
    Nick
    @NapsterInBlue

    Gooooootcha. I didn't realize that the cmap part was coloring more or less randomly. I was interpreting the coloration to be on some sort of scale, based on a column of the dataset-- which, looking at it now, it's obviously just categorical and not trying to put red, grey, green, yellow, and brown on some sort of gradient, haha

    Thanks for the responses!

    arredond
    @arredond

    Hey everybody! After a good half a year, I've finally gotten back to tackling this issue, related to appending data when writing with to_file: geopandas/geopandas#1004

    Adding an optional parameter mode to the to_file method is trivial enough, but I'm having some doubts about writing the tests. Right now, there are two kinds of tests for to_file:

    1. An "end to end" test (test_to_file_roundtrip) in test_file_geom_types_drivers.py
    2. Many different tests in to_file.py, that are tested for just three Fiona (OGR) drivers: GPKG, GeoJSON and ESRI Shapefile

    I suppose I should add a single unit test in the second one, first writing to a file and then appending to it. However, Fiona only supports appending for ESRI Shapefile (of the three I mentioned), so I should probably check fiona.supported_drivers first and only try to append if the driver supports it.

    Would this be the right approach? I've never written unit tests before and don't want to screw things up too badly :P
    Joris Van den Bossche
    @jorisvandenbossche
    There was actually a PR opened for this just a few days ago: geopandas/geopandas#1229
    arredond
    @arredond
    Oh, great! Then it seems like there's a good opportunity to learn looking at those tests :)
    Joris Van den Bossche
    @jorisvandenbossche
    So a PR is not needed anymore (sorry about that if you already started on it!), but feedback on the PR is certainly welcome
    arredond
    @arredond
    I'll take a look :) Thanks for pointing me in the right direction!
    Joris Van den Bossche
    @jorisvandenbossche
    Yeah, the approach that was taken there is to write to a file, append to it, and then verify this was done correctly by reading it back in and checking the result
    arredond
    @arredond
    Hmm, is the fact that the driver is hardcoded to ESRI Shapefile a good approach? It is the most common of the geospatial formats that support append mode in Fiona, but if append mode were supported for GPKG or GeoJSON in the future, it'd be nice to automatically test that, right?
    {k:v for k,v in fiona.supported_drivers.items() if 'a' in v}
    
    {'BNA': 'raw',
     'DXF': 'raw',
     'CSV': 'raw',
     'ESRI Shapefile': 'raw',
     'GML': 'raw',
     'GPX': 'raw',
     'GPSTrackMaker': 'raw',
     'MapInfo File': 'raw',
     'DGN': 'raw'}
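
One possible way to avoid hardcoding the driver is to derive the append-capable drivers from the mode strings and run the test over that list. The sketch below is only an illustration: `SAMPLE_DRIVERS` is a hand-written stand-in for `fiona.supported_drivers` (its values follow the discussion above and are not read from Fiona), and `drivers_supporting` is a hypothetical helper name.

```python
# Stand-in for fiona.supported_drivers: driver name -> mode string
# ('r' = read, 'a' = append, 'w' = write). Values are illustrative only.
SAMPLE_DRIVERS = {
    "ESRI Shapefile": "raw",
    "GPKG": "rw",
    "GeoJSON": "rw",
    "CSV": "raw",
}


def drivers_supporting(mode, supported_drivers):
    """Return the sorted names of drivers whose mode string contains `mode`."""
    return sorted(
        name for name, modes in supported_drivers.items() if mode in modes
    )


# Drivers that an append test could be run against.
append_drivers = drivers_supporting("a", SAMPLE_DRIVERS)
print(append_drivers)
```

With pytest, a list like this could be passed to `pytest.mark.parametrize`, so a driver that gains append support later would be tested without changing the test itself.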
    sangarshanan
    @Sangarshanan
    Hey, I was looking to help out and maybe contribute to the PostGIS thread geopandas/geopandas#595. I took up @jorisvandenbossche's points and added wkb and psql_insert_copy to get a slight speedup. I have just started my journey towards the open source gods, so I would appreciate any comments and suggestions : )
    Henrikki Tenkanen
    @HTenkanen
    Thanks @Sangarshanan for your help with the PostGIS writing functionality. :) I have now implemented a GeoDataFrame.to_postgis() method for geopandas. All comments and feedback are very welcome. :) You can find more information here: geopandas/geopandas#595
    Dani Arribas-Bel
    @darribas
    hello, is there any way to run something like st_make_valid on geopandas?
    Martin Fleischmann
    @martinfleis

    shapely mentions only .buffer(0)

    Passed a distance of 0, buffer() can sometimes be used to “clean” self-touching or self-crossing polygons such as the classic “bowtie”. Users have reported that very small distance values sometimes produce cleaner results than 0. Your mileage may vary when cleaning surfaces.

    But I was not always successful using it
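
For reference, a minimal sketch of the `.buffer(0)` trick on the classic "bowtie" described in the quote. The coordinates are made up for illustration, and, as noted above, this does not work on every invalid geometry.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity

# A self-touching "bowtie": the ring passes through (1, 1) twice.
bowtie = Polygon([(0, 0), (0, 2), (1, 1), (2, 2), (2, 0), (1, 1), (0, 0)])
print(bowtie.is_valid)            # validity of the raw geometry
print(explain_validity(bowtie))   # human-readable reason

# Buffering by a distance of 0 rebuilds the geometry.
cleaned = bowtie.buffer(0)
print(cleaned.is_valid, cleaned.geom_type)
```

On a GeoSeries the same call is available element-wise as `gdf.geometry.buffer(0)`.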
    Dani Arribas-Bel
    @darribas
    yeah I was having that issue... my problem really is that the Spanish Government decided it was OK to release official shapefiles with polygons on top of other polygons...
    so I need to "burn" those on top
    I'm trying st_make_valid in sf/lwgeom in R
    Martin Fleischmann
    @martinfleis
    But that just makes your polygons valid by themselves, it does not resolve the overlapping issue, does it?
    Dani Arribas-Bel
    @darribas
    ahh.. maybe
    Martin Fleischmann
    @martinfleis
    I haven’t seen the data, but assuming that the boundaries are slightly overlapping, I would try snapping them together, using something like this:
    import shapely.ops

    def snap(input, target, tolerance, min=True):
        """
        Snap each geometry in `input` (GeoSeries) to geometries in `target` (GeoSeries).

        min True snaps to closest within tolerance, False to all within tolerance

        TODO: use rtree to get distances only to relevant geoms
        """
        # work on a copy so the input GeoSeries is left untouched
        snapped = input.copy()
        for i, geom in input.items():
            distances = target.distance(geom)
            if min:
                close_geom = target.loc[distances.idxmin()]
                geom = shapely.ops.snap(geom, close_geom, tolerance)
            else:
                # assumed to select from `target` (the original snippet used an undefined name)
                close_geom = target.loc[distances < tolerance]
                for p in close_geom:
                    geom = shapely.ops.snap(geom, p, tolerance)
            snapped.loc[i] = geom
        return snapped
    Dani Arribas-Bel
    @darribas
    if you open it on QGIS or viz it in geopandas, you'll see very clearly there are some polygons on top of other polygons. Now those on top are part of multipart polygons
    Martin Fleischmann
    @martinfleis
    Got it. I would explode it, check for polygons within other polygons and remove them. Then dissolve back
    Dani Arribas-Bel
    @darribas
    my sense is that they have the polygon for one region and, inside that, there's a small bit that belongs to another region, so they just "put it on top"
    that sounds like a great idea, giving it a shot now. Thanks!!!
    Martin Fleischmann
    @martinfleis
    If you want to keep them, you can do overlay
    that should make a hole in the other
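
To illustrate what "making a hole" means here, a small sketch on a single pair of geometries. It uses plain shapely `difference` rather than `geopandas.overlay`, and the two squares (`region`, `enclave`) are invented stand-ins for a region and a small polygon lying on top of it.

```python
from shapely.geometry import box

# Hypothetical data: a large region with a small polygon sitting on top of it.
region = box(0, 0, 10, 10)
enclave = box(4, 4, 6, 6)

# The small polygon lies entirely inside the large one.
print(region.contains(enclave))

# Subtracting it leaves the region with an interior ring (a hole).
region_with_hole = region.difference(enclave)
print(region_with_hole.area, len(region_with_hole.interiors))
```

After this step the two polygons no longer overlap; they only share the boundary of the hole.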
    Dani Arribas-Bel
    @darribas
    how'd I do that? That's more like reality I think
    geopandas.overlay(exploded, exploded)?
    Martin Fleischmann
    @martinfleis
    the safest way would be to explode, split gdf in two. Polygons within other polygons and the rest. and then do gpd.overlay(rest, within)
    Dani Arribas-Bel
    @darribas
    how'd I do the split?
    Martin Fleischmann
    @martinfleis
    give me a minute
    Martin Fleischmann
    @martinfleis
    gdf = gdf.explode()
    
    # this takes time without rtree
    within_idx = []
    for ix, r in gdf.iterrows():
        if gdf.drop(ix).contains(r.geometry).any():
            within_idx.append(ix)
overlayed = gpd.overlay(gdf.loc[within_idx], gdf.loc[~gdf.index.isin(within_idx)], how='union')
    Then you can dissolve your geoms back based on name or something.
    Martin Fleischmann
    @martinfleis
    For some reason I am getting RTreeError: Coordinates must be in the form (minx, miny, maxx, maxy) or (x, y) for 2D indexes from overlay. Not sure why though.
    Martin Fleischmann
    @martinfleis
    Because of empty polygons. Do gdf = gdf.loc[~gdf.geometry.is_empty] after exploding. This dataset is a hell :D
    Dani Arribas-Bel
    @darribas
    hey! getting to work on this again! Thanks very much for all the time on this, we should put it on a gist or something, one would think official datasets are meant to be good...
    Martin Fleischmann
    @martinfleis
    feel free to do so. thanks to this I have at least found a bug #1260 :)
    Dani Arribas-Bel
    @darribas
    I'll fix my problem and polish it on a notebook for a gist
    Dani Arribas-Bel
    @darribas
    OK, after running into more problems, I've given up and I'm using the boundaries from gadm.org
    that dataset is so absurd...
    Levi John Wolf
    @ljwolf
    gadm's great, too, though! definitely should have a package that automatically grabs it in python like GADMTools in R...
    Logan Yoder
    @lyoder3
    Hi all,
    I'm working on a project which utilizes geopandas. Specifically, I'm trying to create a class for a specific type of shapefile we use that inherits from GeoDataFrame. My goal is to make use of from_file and create these new objects using that constructor. Then I want to be able to assign attributes to the new object based on regex matching parts of the filename passed to from_file(). Perhaps creating a subclass isn't the route to go, but I have some added functionality I'd like to add on top of the base class. Is there any way to do this in the new class I'm creating?
    Paul Hobson
    @phobson
    Hard to glean all of your requirements, but I don't think you need to subclass GeoDataFrame. I would write a new class that has all of the attributes you want and a self.data attribute that is your geodataframe, and your other methods could work on that
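
A rough sketch of that composition approach, under several assumptions: the class name `SiteShapefile`, the attributes `site` and `year`, and the filename scheme `<site>_<year>.shp` are all hypothetical. The `reader` argument exists only so the sketch can run without a real file; when omitted it falls back to `geopandas.read_file`.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Any

# Hypothetical naming scheme: "<site>_<year>.shp", e.g. "north_2019.shp".
FILENAME_PATTERN = re.compile(r"(?P<site>[A-Za-z0-9]+)_(?P<year>\d{4})\.shp$")


@dataclass
class SiteShapefile:
    data: Any   # the wrapped GeoDataFrame
    site: str   # parsed from the filename
    year: int   # parsed from the filename

    @classmethod
    def from_file(cls, path, reader=None):
        """Read `path` and attach attributes parsed from its filename."""
        if reader is None:
            import geopandas  # imported lazily; only needed for real files
            reader = geopandas.read_file
        match = FILENAME_PATTERN.search(Path(path).name)
        if match is None:
            raise ValueError(f"unexpected filename: {path}")
        return cls(data=reader(path), site=match["site"], year=int(match["year"]))

    def n_features(self):
        """Example of an extra method that works on the wrapped data."""
        return len(self.data)


# Usage with a stub reader standing in for geopandas.read_file.
obj = SiteShapefile.from_file("surveys/north_2019.shp", reader=lambda p: ["row"])
print(obj.site, obj.year, obj.n_features())
```

Keeping the GeoDataFrame in `data` means the wrapper does not have to deal with pandas returning plain GeoDataFrames from most operations, which is the usual difficulty with subclassing.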