Grigory
@pomadchin
ah, so you are buffering and after that doing a crop, so the result is of the same area, right?
iceland1906
@iceland1906
yes
just did another quick test and found something strange: the actual inputRdd is the output of another buffering operation; if I remove the buffering operation in the previous step, there is no error
I will upload a code snippet to explain what I mean shortly
Grigory
@pomadchin
yea, if you could provide a complete example it would be much easier to help :+1:
iceland1906
@iceland1906
// imports assumed for GT 2.x-style packages
import geotrellis.raster._
import geotrellis.spark._
import org.apache.spark.rdd.RDD

def main(inRdd1: RDD[(SpatialKey, Tile)]): Unit = {
    // Given the input rdd, call funcOne and funcTwo sequentially to do some operations.
    // inRdd1, rdd2, and rdd3 are expected to have the same tile size.
    // funcOne and funcTwo do similar work: buffer -> some operation -> crop back to the original size.

    val rdd2: RDD[(SpatialKey, Tile)] = funcOne(inRdd1) // works

    val rdd3: RDD[(SpatialKey, Tile)] = funcTwo(rdd2) // error happened inside
}

def funcOne(inRdd1: RDD[(SpatialKey, Tile)]): RDD[(SpatialKey, Tile)] = {
    val bufferedRdd1 = inRdd1.bufferTiles(12)
    val outRdd1 = bufferedRdd1.mapValues(
      rddTile => {
        val bufferedTile = rddTile.tile
        val targetArea = rddTile.targetArea

        val editedBufferedTile = func1(bufferedTile)

        editedBufferedTile.crop(targetArea)
      })
    outRdd1
}

def funcTwo(rdd2: RDD[(SpatialKey, Tile)]): RDD[(SpatialKey, Tile)] = {
    val bufferedRdd2 = rdd2.bufferTiles(12)
    val outRdd2 = bufferedRdd2.mapValues(
      rddTile => {
        val bufferedTile = rddTile.tile
        val targetArea = rddTile.targetArea

        val editedBufferedTile = func2(bufferedTile)

        editedBufferedTile.crop(targetArea)
      })
    println(outRdd2.count()) // the error surfaces here, when the count forces evaluation
    outRdd2
}
here is a more complete script
iceland1906
@iceland1906
seems to me that applying bufferTiles() to an RDD which was the output of another bufferTiles() operation does not work
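roughly this shape, condensed from the snippet above (untested; func1/func2 are the same placeholder tile operations):
inRdd
  .bufferTiles(12)
  .mapValues { bt => func1(bt.tile).crop(bt.targetArea) } // first buffer + crop: fine
  .bufferTiles(12)                                        // buffering that result again: error
  .mapValues { bt => func2(bt.tile).crop(bt.targetArea) }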
iceland1906
@iceland1906
hope I explained it clearly
gispathfinder
@zyxgis
I want to set the dfs.replication parameter when I save vector tiles,
but in the geotrellis.spark.store.hadoop.SaveToHadoop class:
  def setup[K](
    rdd: RDD[(K, Array[Byte])],
    keyToUri: K => String
  ): RDD[(K, Array[Byte])] = {
    rdd.mapPartitions { partition =>
      saveIterator(partition, keyToUri, new Configuration){ (k, v) => v }
    }
  }
the Configuration parameter is not exposed
Grigory
@pomadchin
hey @zyxgis I think new Configuration can pick up the input job configuration; correct me if I’m wrong
but you’re welcome to create a PR exposing this configuration object!
do you want to do it yourself? I can help you with preparing a PR so we can merge it faster :+1:
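smth like this, perhaps (just a sketch of the change, untested; the default keeps the current behaviour):
  // hypothetical overload: expose the Hadoop Configuration to callers
  def setup[K](
    rdd: RDD[(K, Array[Byte])],
    keyToUri: K => String,
    conf: Configuration = new Configuration
  ): RDD[(K, Array[Byte])] = {
    rdd.mapPartitions { partition =>
      saveIterator(partition, keyToUri, conf) { (k, v) => v }
    }
  }

  // caller side, setting dfs.replication:
  val conf = new Configuration
  conf.set("dfs.replication", "2")
  SaveToHadoop.setup(rdd, keyToUri, conf)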
Grigory
@pomadchin
hey @iceland1906 I will look into it a bit later but yea it looks like chained buffering is not supported ._.
iceland1906
@iceland1906
@pomadchin thank you
gispathfinder
@zyxgis
@pomadchin Thank you for your trust. I am very happy to try to do it.
ravi006
@ravi006
Hi @pomadchin, in geotrellis-sbt-template I am trying to add the spark-sql libraries, but I get a "main method not found" error when I build the fat jar
Grigory
@pomadchin
Hi @ravi006, what are your dependencies in your build.sbt file?
ravi006
@ravi006
@pomadchin
licenses := Seq("Apache-2.0" -> url("http://www.apache.org/licenses/LICENSE-2.0.html"))

scalacOptions ++= Seq(
  "-deprecation",
  "-unchecked",
  "-Yinline-warnings",
  "-language:implicitConversions",
  "-language:reflectiveCalls",
  "-language:higherKinds",
  "-language:postfixOps",
  "-language:existentials")

publishMavenStyle := true
publishArtifact in Test := false
pomIncludeRepository := { _ => false }

resolvers ++= Seq(
  "locationtech-releases" at "https://repo.locationtech.org/content/groups/releases",
  "locationtech-snapshots" at "https://repo.locationtech.org/content/groups/snapshots"
)

libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-spark" % "2.0.0",
  "org.apache.spark"      %% "spark-core"       % "2.0.0" % Provided,
  "org.apache.spark" %% "spark-sql" % "2.0.0",
  "org.scalatest"         %%  "scalatest"       % "2.2.0" % Test
)

// When creating the fat jar, remove some files with
// bad signatures and resolve conflicts by taking the first
// versions of shared packaged types.
assemblyMergeStrategy in assembly := {
  case "reference.conf" => MergeStrategy.concat
  case "application.conf" => MergeStrategy.concat
  case "META-INF/MANIFEST.MF" => MergeStrategy.discard
  case "META-INF\\MANIFEST.MF" => MergeStrategy.discard
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.discard
  case "META-INF/ECLIPSEF.SF" => MergeStrategy.discard
  case _ => MergeStrategy.first
}

initialCommands in console := """
 |import geotrellis.raster._
 |import geotrellis.vector._
 |import geotrellis.proj4._
 |import geotrellis.spark._
 |import geotrellis.spark.io._
 |import geotrellis.spark.io.hadoop._
 |import geotrellis.spark.tiling._
 |import geotrellis.spark.util._
 """.stripMargin
Grigory
@pomadchin
ravi006
@ravi006
@pomadchin
I tried the above one but it's giving a type error:
assembly / assemblyMergeStrategy := {
  case "reference.conf" => MergeStrategy.concat
  case "application.conf" => MergeStrategy.concat
  case PathList("META-INF", xs@_*) =>
    xs match {
      case ("MANIFEST.MF" :: Nil) => MergeStrategy.discard
      // Concatenate everything in the services directory to keep GeoTools happy.
      case ("services" :: _ :: Nil) =>
        MergeStrategy.concat
      // Concatenate these to keep JAI happy.
      case ("javax.media.jai.registryFile.jai" :: Nil) | ("registryFile.jai" :: Nil) | ("registryFile.jaiext" :: Nil) =>
        MergeStrategy.concat
      case (name :: Nil) => {
        // Must exclude META-INF/*.([RD]SA|SF) to avoid "Invalid signature file digest for Manifest main attributes" exception.
        if (name.endsWith(".RSA") || name.endsWith(".DSA") || name.endsWith(".SF"))
          MergeStrategy.discard
        else
          MergeStrategy.first
      }
      case _ => MergeStrategy.first
    }
  case _ => MergeStrategy.first
}
it's giving a compile error in IntelliJ
ravi006
@ravi006
@pomadchin
thank you so much, I copy-pasted this portion and it's working:
assemblyMergeStrategy in assembly := {
  case "reference.conf" => MergeStrategy.concat
  case "application.conf" => MergeStrategy.concat
  case PathList("META-INF", xs@_*) =>
    xs match {
      case ("MANIFEST.MF" :: Nil) => MergeStrategy.discard
      // Concatenate everything in the services directory to keep GeoTools happy.
      case ("services" :: _ :: Nil) =>
        MergeStrategy.concat
      // Concatenate these to keep JAI happy.
      case ("javax.media.jai.registryFile.jai" :: Nil) | ("registryFile.jai" :: Nil) | ("registryFile.jaiext" :: Nil) =>
        MergeStrategy.concat
      case (name :: Nil) => {
        // Must exclude META-INF/*.([RD]SA|SF) to avoid "Invalid signature file digest for Manifest main attributes" exception.
        if (name.endsWith(".RSA") || name.endsWith(".DSA") || name.endsWith(".SF"))
          MergeStrategy.discard
        else
          MergeStrategy.first
      }
      case _ => MergeStrategy.first
    }
  case _ => MergeStrategy.first
}
this is working :)
Grigory
@pomadchin
Perfect :+1:
Grigory
@pomadchin
@ravi006 I would also recommend you use GT 3.x :D
ravi006
@ravi006
@pomadchin
Yeah, I would like to use 3.x since this version has lots of features, but my Spark version is 2.0.0.
Can I use GT 3.x with Spark version 2.0.0?
Grigory
@pomadchin
Ahhh; dunno; I guess not ._. but you can try
I’m pretty sure some runtime errors may happen
John
@Canadianboy122_twitter
Hi all, I just got started with GeoTrellis and I can't figure out one thing: when I modify, for example, a raster layer, how can I save it locally as a TIFF? I can't find any example.
iceland1906
@iceland1906
@pomadchin does GT have a shapefile writer feature? I only found ShapefileReader
Grigory
@pomadchin
Hi @Canadianboy122_twitter you totally can; how do you work with rasters through GT?
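on the Scala side it's smth like this (a sketch for a single-band raster; tile, extent, and crs are assumed to be whatever you already have in scope):
import geotrellis.raster.io.geotiff.GeoTiff
// tile: Tile, extent: Extent, crs: CRS assumed in scope
GeoTiff(tile, extent, crs).write("/tmp/modified.tif")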
hi @iceland1906 nope, we don’t have any API exposed, but we depend on http://docs.geotools.org/latest/javadocs/org/geotools/data/shapefile/shp/ShapefileWriter.html so you can use it
Also, can you document this issue and file it as a feature request? I guess we never had it because no one asked for it
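the low-level ShapefileWriter is a bit fiddly btw; the GeoTools datastore API is easier, smth like (a sketch, untested; featureType and features are assumed to be a SimpleFeatureType and a SimpleFeatureCollection you already built):
import java.io.File
import java.util.Collections
import org.geotools.data.shapefile.{ShapefileDataStore, ShapefileDataStoreFactory}
import org.geotools.data.simple.SimpleFeatureStore

val factory = new ShapefileDataStoreFactory
val params = Collections.singletonMap[String, java.io.Serializable](
  "url", new File("/tmp/out.shp").toURI.toURL)
val ds = factory.createNewDataStore(params).asInstanceOf[ShapefileDataStore]
ds.createSchema(featureType) // featureType: SimpleFeatureType (assumed)
val store = ds.getFeatureSource(ds.getTypeNames()(0)).asInstanceOf[SimpleFeatureStore]
store.addFeatures(features)  // features: SimpleFeatureCollection (assumed)
ds.dispose()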
iceland1906
@iceland1906
will do. thanks for the link @pomadchin
ravi006
@ravi006
Hi @pomadchin, does GeoTrellis provide any libraries for working with shapefiles in Spark?
Grigory
@pomadchin
Hi @ravi006 what do you mean by that, could you describe a use case?
tbh we don’t have any API around it (: but we can build it / probably some workarounds are acceptable
ravi006
@ravi006
@pomadchin, I didn't see an RDD reader for shapefiles; are there any libraries which can convert shapefiles into Spark RDDs?
Grigory
@pomadchin
@ravi006 yea, we don't have any; you can do it manually by parallelizing the URIs of the shapefiles and reading them in on executors; smth like
sc.parallelize(uris).map(readShapefileFromTheURI)
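fleshed out a bit (a sketch; readShapefileFromTheURI above is hypothetical, but GeoTrellis's ShapefileReader could play that role, assuming its readSimpleFeatures(path) method):
import geotrellis.shapefile.ShapefileReader
val uris = Seq("/data/a.shp", "/data/b.shp") // hypothetical paths reachable from executors
val featuresRdd = sc.parallelize(uris).flatMap { uri =>
  ShapefileReader.readSimpleFeatures(uri) // runs on the executors
}
// note: SimpleFeature is not serializable, so convert features to
// geometries/attribute values before shuffling the result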
and also you can create an issue with a feature request (:
would this work for you, or you need some more complicated case to handle?
John
@Canadianboy122_twitter
@iceland1906 I load a tiff with geopyspark.geotiff.get and then, for example, I mask part of it. How can I save only the masked area? I found how to save a stitched result after a union, but I can't find a way to save just one.
John
@Canadianboy122_twitter
@pomadchin wrong tag sry
Grigory
@pomadchin
@Canadianboy122_twitter ah, I think we planned to use GeoPySpark with GeoTrellis layers mostly
I don’t remember whether there was any tiff API exposed at all
so if I read you correctly, it looks like you’re mostly interested in the Python API?