@rajuthegr8 Welcome and thanks for your interest in GSoC!
In case you haven't seen it, there is some high-level info here: https://github.com/geopandas/geopandas/wiki/Google-Summer-of-Code-2021#pure-python-geopackage-io
`pgpkg` is just a starting point; contributions don't necessarily have to be there. Depending on what is involved, it may be appropriate for those changes to land directly in geopandas in the end (but developing them outside geopandas would probably be easiest at the start).
`pgpkg` already does some of this). We'd like the solution created under this project to produce GeoPackages that comply with the GPKG specification, so it will take a bit of work to get a good understanding of the spec and how it manifests in the tables stored in the SQLite database that make it a valid GeoPackage.
`pgpkg` isn't quite right; other GIS tools don't recognize the CRS info it creates)
`pgpkg` does a few hackish things to try to do this, but a better solution is likely needed (context: geometry bytes in a GeoPackage are a structured set of header bytes plus the well-known binary of the actual geometry). We'd like this handled in a cleaner way that meets the spec.
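
To make the "header bytes plus well-known binary" point concrete, here is a minimal sketch of how such a geometry blob could be assembled with only the standard library. This is not what `pgpkg` does; the function names `point_wkb` and `gpkg_geometry_blob` are hypothetical, and the layout (magic `GP`, version byte, flags byte, SRS id, optional XY envelope, then WKB) follows my reading of the GeoPackage binary format, so please check it against the spec before relying on it.

```python
import struct


def point_wkb(x, y):
    """Little-endian WKB for a 2D point (hypothetical helper for this sketch)."""
    # byte order flag (1 = little endian), geometry type (1 = Point), x, y
    return struct.pack("<BIdd", 1, 1, x, y)


def gpkg_geometry_blob(wkb, srs_id, envelope=None):
    """Prefix WKB with a GeoPackage-style header (illustrative sketch only).

    envelope, if given, is (minx, maxx, miny, maxy).
    """
    # Envelope indicator lives in bits 1-3 of the flags byte:
    # 0 = no envelope, 1 = XY envelope (4 doubles).
    envelope_code = 0 if envelope is None else 1
    # Bit 0 set means the header values (srs_id, envelope) are little endian.
    flags = (envelope_code << 1) | 0x01
    # Magic "GP", version byte 0, flags byte, then srs_id as a signed int32.
    header = b"GP" + struct.pack("<BBi", 0, flags, srs_id)
    if envelope is not None:
        header += struct.pack("<4d", *envelope)
    return header + wkb


blob = gpkg_geometry_blob(point_wkb(1.0, 2.0), 4326, envelope=(1.0, 1.0, 2.0, 2.0))
```

The sketch ignores the empty-geometry flag and the Z/M envelope variants, which a spec-compliant writer would also need to handle.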
`pgpkg` and have a better idea of what must be done. From what I understand, I should probably start putting together a basic proposal once I get the template and, in the meantime, start sending some PRs to make `pgpkg` more compatible with the official spec. Any other suggestions or advice would be highly appreciated.
`pgpkg` to capture your ideas / findings that can be addressed in later PRs (mostly I'm thinking of these as a way of capturing a few of the technical specifics while they are fresh). You can certainly link these into your proposal, but like @martinfleis said, your focus should be on the overall proposal and bigger picture: overall approach, goals, your sense of how hard it will be to go from what is available to meeting the goals for the project, etc. (I probably got overly excited by the specifics I listed off above, apologies if that was a bit of a distraction from the overall proposal)
`dask-geopandas` is in an early stage of development so there is a lot to do. I would say that the ideal roadmap now would be 1) spatial partitioning, 2) spatial indexing, 3) overlapping computation, based on my own experience and needs (which may be biased). Maybe IO... You can check the issues at https://github.com/geopandas/dask-geopandas/issues for a discussion on some of these. If you have some needs yourself, feel free to embed them in the proposal. Also note that we have submitted a workshop proposal around dask-geopandas to Dask Summit, so there may be a chance to have a good discussion on priorities there with a wider range of people.
`bug` and find some you're comfortable fixing. They require various levels of expertise and some will be easier than others, so you should be able to find some you're comfortable with. If you want to work on documentation, #1896 is a nice one to start with.