I'd love to see what you came up with in a gist. Also, I'm curious whether Coarray Fortran could suit your needs, or whether you are relying on certain low-level MPI calls. Coarray Fortran lets you leverage one-sided comms without the hassle and complexity of all the low-level MPI calls. Intel has a decent (at least F2008-compliant) implementation, and GFortran w/ OpenCoarrays supports some Fortran 2015 features like events in addition to all (I think*) F2008 features.
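
To make the one-sided point concrete, here is a minimal sketch of a coarray "put" around a ring of images. It is only an illustration, not a drop-in for your code: the program name, the `inbox` variable, and the ring layout are all made up for the example, and it uses only F2008 features.

```fortran
program coarray_put_sketch
  implicit none
  integer :: inbox[*]        ! one integer per image, remotely addressable (hypothetical name)
  integer :: me, n, right

  me = this_image()
  n  = num_images()
  right = merge(1, me + 1, me == n)   ! right-hand neighbor, wrapping around the ring

  inbox = 0
  sync all                   ! every inbox is initialized before any remote write

  inbox[right] = me          ! one-sided put: write straight into the neighbor's inbox

  sync all                   ! all puts have completed before anyone reads
  print '(a,i0,a,i0)', 'image ', me, ' received ', inbox
end program coarray_put_sketch
```

With OpenCoarrays this should build and run with something like `caf sketch.f90 -o sketch && cafrun -n 4 ./sketch`; with Intel, the `-coarray` flag enables coarrays. Each image ends up holding the index of its left neighbor, and the receiving side never posts a matching receive or sets up a window, which is most of the appeal over hand-written MPI RMA.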