Lucy Mathieson
@LucyMathieson98_twitter
Hi @morganjwilliams thanks very much for letting me know! Fantastic
WVU VPL Graham
@gdma1977
Hi. First post from a new user. I have imported a pandas DataFrame of REE + Y data. pyrolite only recognizes Gd and Tb as REEs in the dataframe when using df.head(5).pyrochem.REE; however, all are there in the order La, Ce...Lu, Y. It doesn't seem to make a difference if I add or remove an empty column for Pm. When I plot the spidergram only Gd and Tb are plotted. Any ideas?
The REE data have already been normalized
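The thread does not record what the eventual fix was. As a generic first check, and assuming the cause is a mismatch in the column headers (stray whitespace or different capitalisation), one could compare the headers against the expected element symbols. The frame and its headers below are made up for illustration, and the element list is written out by hand rather than taken from pyrolite:

```python
import pandas as pd

# Expected REE symbols (Pm omitted as it is rarely measured), plus Y.
EXPECTED = ["La", "Ce", "Pr", "Nd", "Sm", "Eu", "Gd", "Tb",
            "Dy", "Ho", "Er", "Tm", "Yb", "Lu", "Y"]

# Hypothetical frame with the kind of header problems spreadsheets can produce.
df = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]], columns=["La ", "ce", "Gd", "Tb"])

# Headers that already match an expected symbol exactly.
exact = [c for c in df.columns if c in EXPECTED]

# Headers that match only after stripping whitespace and ignoring case.
lookup = {e.lower(): e for e in EXPECTED}
renames = {
    c: lookup[c.strip().lower()]
    for c in df.columns
    if c not in EXPECTED and c.strip().lower() in lookup
}
cleaned = df.rename(columns=renames)

print("recognised as-is:", exact)
print("fixable by renaming:", renames)
```

If `renames` comes back non-empty, renaming those columns before plotting would be the first thing to try.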
WVU VPL Graham
@gdma1977
The example data work fine
WVU VPL Graham
@gdma1977
I fixed my problem #50
Morgan Williams
@morganjwilliams
@gdma1977 glad you found a solution! Let us know if you run into anything else. If you don't have any more info you'd like to add for morganjwilliams/pyrolite#50 I'll close that shortly.
dlucas
@dlucasjr
Hello Morgan! I'm trying to create a spider plot but using a different color for each sample. I'm trying to assign the sample column of my data frame, but it's not working. I'm doing something like this: label=df['Sample']
Morgan Williams
@morganjwilliams
Hey @dlucasjr! Try using color=df["Sample"] and see if that's roughly what you're after? The label keyword argument is only used for legends, and in this case the labels won't match up with the lines/scatter as they're two different collections for spider plots (I tend to build them manually if needed).
dlucas
@dlucasjr
Thanks for your answer, Morgan, it worked fine! But as I cannot identify the colors generated automatically, I specified them myself to generate the legend. Is there an easy way to do this? It's a simple task for a small dataset but for a larger one this could be painful hahaha.
import matplotlib.pyplot as plt
import pyrolite
from matplotlib.lines import Line2D

ax = df1.pyrochem.normalize_to('Chondrite_SM89', units='ppm').pyroplot.REE(
    unity_line=True,
    alpha=0.6,
    color=['C0', 'C1', 'C2', 'C3', 'C4', 'C5', 'C6'],
    cmap=''
    #figsize=(30, 8)
    )

custom_lines = [Line2D([0], [0], color='C0', lw=1,),
                Line2D([0], [0], color='C1', lw=1),
                Line2D([0], [0], color='C2', lw=1),
                Line2D([0], [0], color='C3', lw=1),
                Line2D([0], [0], color='C4', lw=1),
                Line2D([0], [0], color='C5', lw=1),
                Line2D([0], [0], color='C6', lw=1)]

ax.legend(custom_lines, df1['Label'])
dlucas
@dlucasjr
image.png
Morgan Williams
@morganjwilliams
You can use process_color from pyrolite.plot.color (which spider is using behind the scenes) together with proxy_line from pyrolite.util.plot.legend to make this bearable:
from pyrolite.util.plot.legend import proxy_line
from pyrolite.plot.color import process_color

ax = df.pyrochem.normalize_to("Chondrite_PON", units='ppm').pyroplot.REE(color=df['Sample'],  unity_line=True, alpha=0.6,)

labels = df['Sample'].unique()
proxies = [proxy_line(color=c) for c in process_color(color=labels)['c']]
ax.legend(proxies, labels)
The key thing here is that as soon as you have more than one instance of a label in the dataframe you'll end up with duplicated legend entries; spider's using the unique values as we do here behind the scenes so these should line up.
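The same idea can be sketched with matplotlib alone, which may help to see why the unique values matter. This is not what pyrolite does internally; the data are made up, and the colour mapping (`C0`, `C1`, ... in order of first appearance) is an assumption for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

# Hypothetical data: two rows share the label "A".
df = pd.DataFrame(
    {
        "Sample": ["A", "A", "B"],
        "La": [10.0, 12.0, 30.0],
        "Ce": [8.0, 9.0, 20.0],
        "Nd": [5.0, 6.0, 11.0],
    }
)
elements = ["La", "Ce", "Nd"]

# One entry per distinct label, in order of first appearance.
labels = list(df["Sample"].unique())
colors = {label: f"C{i}" for i, label in enumerate(labels)}

fig, ax = plt.subplots()
for _, row in df.iterrows():
    ax.plot(elements, row[elements].astype(float), color=colors[row["Sample"]], alpha=0.6)
ax.set_yscale("log")

# One proxy line per unique label, so the legend has no duplicates.
proxies = [Line2D([0], [0], color=colors[label]) for label in labels]
legend = ax.legend(proxies, labels)
```

Three lines are drawn but only two legend entries appear, because the proxies are built from the unique labels rather than from every row.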
Morgan Williams
@morganjwilliams
proxy_line is simply Line2D without needing to specify the coordinates; you can pass all line arguments to this. You can also pass marker arguments, but in a different way to how they'd be passed to spider (e.g. using markeredgecolor instead of edgecolors, as per the standard differences between plt.scatter and plt.plot).
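To illustrate the keyword difference with plain matplotlib (using `Line2D` directly rather than `proxy_line`, and made-up points), a scatter call and a matching legend proxy might look like this:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# Scatter-style keywords: c, s (marker area in points^2), edgecolors.
ax.scatter([1, 2, 3], [3, 1, 2], c="C0", s=36, edgecolors="k")

# Line-style keywords for the legend proxy: markerfacecolor, markersize
# (marker width in points), markeredgecolor.
proxy = Line2D(
    [0], [0],
    color="C0",
    marker="o",
    markersize=6,
    markerfacecolor="C0",
    markeredgecolor="k",
)
legend = ax.legend([proxy], ["Sample A"])
```

Note that `s=36` and `markersize=6` describe the same marker, since `s` is an area and `markersize` a width.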
Morgan Williams
@morganjwilliams
@dlucasjr hope that helps! There may be some updates to spider plot legend proxies on the horizon, which could make this a bit easier.
dlucas
@dlucasjr
Amazing answer, thanks!! If it's not too much work, the examples page on the website could have some examples of this until then. Maybe it'll be helpful for people having the same questions as mine.
Morgan Williams
@morganjwilliams
Definitely - that's part of the reason this forum exists - it feeds the documentation. 😁
Will look at adding something tomorrow
Morgan Williams
@morganjwilliams
Just a quick note that as of matplotlib v3.4.0 there's an issue with ternary plot creation via mpltern; I've pinned the version <3.4 on the development version of pyrolite, but if you see an error about a missing x axis when making ternary plots, consider installing the previous version of matplotlib. I'll see if I can catch the mpltern creator next week to sort this out.
@dlucasjr I also added a section to the spiderplot example page on legends and proxies for spider plots with a bit more detail on this, including how you might go about reordering the items if you wanted to.
dlucas
@dlucasjr
@morganjwilliams I just saw that, it's great! Thanks for your readiness to help us
dlucas
@dlucasjr
Hello! Here I am again haha. @morganjwilliams how can I deal with some missing data on spider plots? Is it possible to just connect the points with a line without dropping the entire column?
Morgan Williams
@morganjwilliams
From memory, it should handle missing data natively, but perhaps send through an example if it's not behaving how you'd expect. Will check back in the morning.
dlucas
@dlucasjr

Here's an example:

df_porf = pd.read_excel('etr_calckp.xlsx')
ax = df_porf.pyroplot.spider(unity_line=True, alpha=0.4, color='C3')

or

df_char = pd.read_excel('etr_charno.xlsx')
ax = df_char.pyroplot.spider(unity_line=True, alpha=0.6, color='C5')

Thanks!

porf.png
charn.png
Morgan Williams
@morganjwilliams
Ok, those come out how I'd generally intend them to (avoiding interpolation, as it ends up making it look like there's more data than there is). There will be workarounds for plotting across these gaps if you really want to; if I get a chance today I'll try and post one here.
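The difference between the two behaviours can be seen with plain matplotlib. This is a minimal sketch with a single made-up sample, not pyrolite code: a NaN passed to `plot` breaks the line, while removing the NaN positions first joins the neighbouring points.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

elements = ["La", "Ce", "Pr", "Nd", "Sm"]
x = np.arange(len(elements))
y = np.array([100.0, 80.0, np.nan, 40.0, 30.0])  # hypothetical sample with Pr missing

fig, (ax_gap, ax_joined) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))

# Passing NaN straight through leaves a break in the line.
ax_gap.plot(x, y, marker="o")
ax_gap.set_title("NaN kept: gap")

# Dropping the NaN positions first joins the neighbouring points.
present = np.isfinite(y)
ax_joined.plot(x[present], y[present], marker="o")
ax_joined.set_title("NaN dropped: joined")

for ax in (ax_gap, ax_joined):
    ax.set_yscale("log")
    ax.set_xticks(x)
    ax.set_xticklabels(elements)
```

The joined version draws a straight segment from Ce to Nd, which is exactly the visual suggestion of extra data mentioned above.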
dlucas
@dlucasjr
I would really appreciate this, thanks!
Morgan Williams
@morganjwilliams
Give this one a shot:
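import numpy as np
import matplotlib.pyplot as plt
import pyrolite.plot  # registers the .pyroplot accessor
from pyrolite.util.missing import md_pattern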

inds, patterns = md_pattern(df)  # get each of the unique missing data patterns from df

indexes = np.arange(df.columns.size)  # these are the x-axis index values for spider

spider_config = dict(
    alpha=0.6, color="C5"
)  # the configuration you'd like to use for this figure
# construct the plot one missing-data-pattern at a time so no NaN will get passed to plot
fig, ax = plt.subplots(1)

for ix in np.unique(inds):
    fltr = ~patterns[ix]["pattern"] # this is the boolean missing data pattern
    df.loc[inds == ix, df.columns[fltr]].pyroplot.spider(
        ax=ax,
        indexes=indexes[fltr],
        unity_line=True if ix == 0 else False,
        **spider_config
    )
# as of the current version of pyrolite you'll need to reset ticks *after* the above
ax.set_xticks(indexes)
ax.set_xticklabels(df.columns)
It's a bit involved but might get you to where you want.
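For anyone wondering what "missing data pattern" means here, the grouping step can be sketched with numpy and pandas alone. This is not how `md_pattern` is implemented and its return values may be ordered differently; the data are made up:

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 1 and 3 are both missing Ce, row 2 is missing Nd.
df = pd.DataFrame(
    {
        "La": [10.0, 12.0, 9.0, 11.0],
        "Ce": [8.0, np.nan, 7.0, np.nan],
        "Nd": [5.0, 6.0, np.nan, 6.5],
    }
)

missing = df.isna().to_numpy()  # True where a value is absent

# Unique rows of the mask are the patterns; `inverse` gives each row's pattern id.
patterns, inverse = np.unique(missing, axis=0, return_inverse=True)
inverse = inverse.reshape(-1)  # guard against shape differences between numpy versions

for pattern_id, pattern in enumerate(patterns):
    rows = df.index[inverse == pattern_id].tolist()
    present = df.columns[~pattern].tolist()
    print(f"pattern {pattern_id}: rows {rows} have values for {present}")
```

Rows sharing a pattern have values in the same columns, so each group can be plotted without any NaN reaching the line-drawing call.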
Morgan Williams
@morganjwilliams

Alternatively, if you want to do this for multiple spider plots, you can wrap it up in a function:

import numpy as np
import pyrolite.plot  # registers the .pyroplot accessor
from pyrolite.util.missing import md_pattern
from pyrolite.util.plot.axes import init_axes


def nogap_spider(df, ax=None, figsize=(6, 4), **kwargs):
    """
    Create a spider plot without gaps for missing data (interpolate between existing
    data points).

    Parameters
    ----------
    df : pandas.DataFrame
        Data to plot.
    ax : matplotlib.axes.Axes
        Optional specification of which axes to plot on, if one exists.
    figsize : tuple
        Size of figure to create if ax is not specified.

    Returns
    -------
    ax : matplotlib.axes.Axes
    """
    # get each of the unique missing data patterns from df
    inds, patterns = md_pattern(df)

    indexes = np.arange(df.columns.size)  # these are the x-axis index values for spider

    # construct the plot one missing-data-pattern at a time so no NaN will get passed to plot
    if ax is None:
        ax = init_axes(figsize=figsize)

    for ix in np.unique(inds):
        fltr = ~patterns[ix]["pattern"]  # this is the boolean missing data pattern
        df.loc[inds == ix, df.columns[fltr]].pyroplot.spider(
            ax=ax,
            indexes=indexes[fltr],
            unity_line=True if ix == 0 else False,
            **kwargs
        )
    # as of the current version of pyrolite you'll need to reset ticks *after* the above
    ax.set_xticks(indexes)
    ax.set_xticklabels(df.columns)
    return ax

Then use it like:

nogap_spider(df, alpha=0.6, color="C5")
@dlucasjr hope that's approximately what you're after!
dlucas
@dlucasjr
This is exactly what I was looking for! Thanks again @morganjwilliams
Morgan Williams
@morganjwilliams
Great, enjoy! 😁
Angusrog
@Angusrog

Hey @morganjwilliams, I'm a new user of pyrolite and complete python noob, and stuck on something that feels trivial! I have a simple spider diagram for 20-ish samples which are PM normalised, and I want to add in a reference line, e.g. MORB_Gale2013 (and normalise that to PM). I've tried:

MORB = get_reference_composition('MORB_Gale2013')
ax = MORB.pyrochem.REE.pyrochem.normalize_to('PM_PON').pyroplot.spider(unity_line=True, color='0.5', alpha=0.4)

And to my caveman understanding, the problem here is MORB_Gale2013 isn't recognised in the format I am trying to use it.
Apologies that I can't explain my problem more accurately, I don't have the knowledge of python to know how to describe it.
Here's the spider code I'm trying to work on in case that helps solve my conundrum.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyrolite.geochem
import pyrolite.plot
from pyrolite.geochem.ind import REE
from pyrolite.geochem.norm import get_reference_composition, all_reference_compositions

from pyrolite.util.units import scale

filepath = r'C:\Users\Ait_csv_summary.csv'
df = pd.read_csv (filepath)
df = df.set_index('Sample', drop=True)

df.pyrochem.elements
els_ppm = df.pyrochem.elements
els_ppm.pyrochem.elements *= scale('ppb','ppm')

ax = els_ppm.pyrochem.REE.pyrochem.normalize_to('PM_PON').pyroplot.spider(unity_line=True, color='0.5', alpha=0.4)
ax.set_ylabel("X / $X_{Primitive Mantle}$")
Morgan Williams
@morganjwilliams
G'day @Angusrog, try MORB = get_reference_composition("MORB_Gale2013").comp. Will get back to you tomorrow if that doesn't sort out the issue and explain what's going on (on a laptop rather than a phone..).
Morgan Williams
@morganjwilliams
The other thing you'll need to do here is pass the existing axis to the second call to spider using pyroplot.spider(ax=ax, ...). The reason you need to add comp to get the composition dataframe is that get_reference_composition() returns a pyrolite.geochem.norm.Composition object, rather than just a dataframe - mainly so the units can be adjusted etc.
Morgan Williams
@morganjwilliams
If you wanted to see some of the internals around how these compositions are imported etc, you can check out the source code for Composition and its comp attribute (here).
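The two points above (a wrapper object whose dataframe lives on an attribute, and a second plot call that reuses the first call's axes) can be sketched without pyrolite. `ReferenceComposition` below is a hypothetical stand-in, not pyrolite's `Composition` class, and all values are made up:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


class ReferenceComposition:
    """Hypothetical stand-in: wraps a one-row DataFrame plus its units."""

    def __init__(self, comp, units="ppm"):
        self.comp = comp  # the plain DataFrame lives on an attribute
        self.units = units


elements = ["La", "Ce", "Nd"]
samples = pd.DataFrame([[3.0, 2.5, 2.0], [4.0, 3.0, 2.2]], columns=elements)
reference = ReferenceComposition(pd.DataFrame([[1.5, 1.4, 1.3]], columns=elements))

# First set of lines creates the axes.
fig, ax = plt.subplots()
for _, row in samples.iterrows():
    ax.plot(elements, row.to_numpy(), color="0.5", alpha=0.4)

# The reference line is drawn on the SAME axes; `.comp` gets at the DataFrame.
ax.plot(elements, reference.comp.iloc[0].to_numpy(), color="k", lw=2, label="reference")
ax.set_yscale("log")
ax.legend()
```

Without reusing `ax`, the reference line would end up on a separate figure from the samples.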
Angusrog
@Angusrog
Thanks for the help @morganjwilliams, after a bit of messing around I got it to function!
Vaida Kirkliauskaite
@VaidaKirkliauskaite

Hey @morganjwilliams, I have started plotting isotopic data on scatter diagrams. The problem is that I have lots of datasets from various deposits around the world. What is the best approach to comparing this overlapping data? How can I construct a density plot while also adding a different polygon around the dataset from each deposit?

#Here the y axis shows the various deposits and the x axis the Fe isotope values. I would like to use a different color for each dataset and draw polygons labelled "magmatic origin" and "low temperature magnetites"
df = pd.DataFrame(pd.read_excel("Reference.xlsx"))
df 
Iron = df["δ56Fe in ‰"]
Sample = df["Sample"]
x = []
y = []
x = list (Iron)
y = list (Sample)
plt.scatter(x,y)
plt.xlabel('δ56Fe in ‰')
plt.ylabel('Iron oxide-apatite deposits')
plt.title('Data')

pt1 = [0.1,0]
pt2 = [0.9,0]
pt3 = [0.9,11]
pt4 = [0.1,11]

line = plt.Polygon([pt1, pt2, pt3, pt4], closed=True, fill=None, edgecolor='r')
plt.gca().add_line(line)

plt.show ()
#In this one I want to separate deposits by drawing overlapping polygons, but maybe a density plot could work better in this case, somehow showing which deposit has more samples in the magmatic range.
df = pd.DataFrame(pd.read_excel("Reference.xlsx"))
df 
Oxygen = df["δ18O in ‰"]
Iron = df["δ56Fe in ‰"]
x = []
y = []
x = list (Oxygen)
y = list (Iron)
plt.scatter(x,y)
plt.xlabel('δ18O in ‰')
plt.ylabel('δ56Fe in ‰')
plt.title('Data')

pt1 = [1,0]
pt2 = [4,0]
pt3 = [4,1]
pt4 = [1,1]

line = plt.Polygon([pt1, pt2, pt3, pt4], closed=True, fill=None, edgecolor='r')
plt.gca().add_line(line)

plt.show ()

Thank you.

Morgan Williams
@morganjwilliams
Hey @VaidaKirkliauskaite! For the first one, it'll depend a bit on how things are organised in your dataframe, but I think something like below might work for you:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(pd.read_excel("Reference.xlsx"))
# change this column name to reflect where you store deposit names
deposits = df["Deposit"].unique()

################################################################################
fig, ax = plt.subplots(1)

# raw string so the backslashes reach matplotlib's mathtext unchanged
ax.set(xlabel=r"$\delta^{56}$Fe ($\perthousand$)", ylabel="Iron Oxide-Apatite Deposit")
ax.set_yticks(np.arange(len(deposits)))
ax.set_yticklabels(deposits)

ax.axvspan(0.1, 0.9, color="0.5", alpha=0.1, zorder=-1)

for ix, deposit in enumerate(deposits):
    xdata = df.loc[df["Deposit"] == deposit, "δ56Fe in ‰"].values
    ax.scatter(xdata, ix * np.ones_like(xdata))
With a little dummy dataset, I get something a bit like this.
image.png
Morgan Williams
@morganjwilliams
Note I used ax.axvspan() here instead of a polygon to identify the background for magmatic/low-T fields; this will enable you to put as many deposits as you like along the y axis and not have to worry about changing the positioning. This is essentially a polygon, so you can also specify edgecolor and linewidth parameters for it.
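A small sketch of an outlined and labelled band follows. The 0.1 to 0.9 limits are the ones used above and the "magmatic origin" label comes from the question; the outline colour and the placement of the text are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Shaded band spanning the full height of the axes between x = 0.1 and x = 0.9.
band = ax.axvspan(
    0.1, 0.9, facecolor="0.5", edgecolor="r", linewidth=1, alpha=0.1, zorder=-1
)

# Label in mixed coordinates: x in data units, y as a fraction of the axes height,
# so the text stays at the top however many deposits are on the y axis.
ax.text(
    0.5, 0.97, "magmatic origin",
    ha="center", va="top",
    transform=ax.get_xaxis_transform(),
)
ax.set_xlim(-0.5, 1.5)
```

A second `axvspan` with different limits and label would mark the low-temperature field in the same way.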
Morgan Williams
@morganjwilliams
For the second one, the approach might depend on how many deposits you have and how many data points you have for each. The density plots will only work well where you have a decent amount of data, otherwise they might be a bit misleading. And while you could add data density contours for each of them, if you have lots of deposits it might get a bit distracting. If you can get back to me with a bit of an idea of these aspects I'll try and provide a bit of a suggestion for one or two ways to go about it.
I'd also recommend checking out the more object-oriented aspects to matplotlib (like I use above, using methods of the axis object ax.set, ax.scatter etc rather than the pyplot interface e.g. plt.scatter) if you'd like a bit finer control over your plotting; most of the pyrolite examples tend to use this also.
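As a rough illustration of the object-oriented style applied to the second plot, the sketch below colours points by deposit and outlines a field from (1, 0) to (4, 1), the corner points given in the question. The data values and the `Deposit` column name are assumptions:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Rectangle

# Hypothetical values; "Deposit" is an assumed column name.
df = pd.DataFrame(
    {
        "Deposit": ["A", "A", "B", "B", "B"],
        "δ18O in ‰": [1.5, 2.0, 3.5, 5.0, 4.2],
        "δ56Fe in ‰": [0.2, 0.4, 0.1, 0.6, 0.8],
    }
)

fig, ax = plt.subplots()

# One scatter call per deposit gives each its own colour and legend entry.
for deposit, group in df.groupby("Deposit"):
    ax.scatter(group["δ18O in ‰"], group["δ56Fe in ‰"], label=deposit)

# Field outline from (1, 0) to (4, 1); patches are added with add_patch.
field = Rectangle((1, 0), width=3, height=1, fill=False, edgecolor="r")
ax.add_patch(field)

ax.set(xlabel="δ18O in ‰", ylabel="δ56Fe in ‰", title="Data")
legend = ax.legend(title="Deposit")
```

Everything is a method call on `ax`, so the same code can be pointed at any panel of a multi-panel figure.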
Kris Sokol
@KSokol79

Hi @morganjwilliams, I'm trying to get my head around Python and pyrolite. Forgive the noob tone of the message, but I'm quite new to Python (only marginally better in R). Wondering if you can help out with constructing two modified stem plots, which would resemble column charts?
alt

I'm not sure how to get 4 variables each assigned to an element, but I figure it would resemble a matplotlib grouped chart? It'd be nice to have a sequential color palette increasing based on year. The attempt below doesn't work but that's as far as I got thinking it through.
Does pyroplot support subplots?
Do you think it's reasonable to try and do this in pyrolite?

Cheers

import pandas as pd
import matplotlib.pyplot as plt
from pyrolite.plot import pyroplot
from pyrolite.plot.stem import stem

EU_crit_ind = pd.read_csv('data.csv')
xcommodities = [EU_crit_indices['Symbols']]
y_set1 = ['SR*2011', 'SR*2014', 'SR2017', 'SR2020']
y_set2 = ['EI2011', 'EI2014', 'EI2017', 'EI2020']

SR_plot = pyroplot.stem(data=EU_crit_ind, x=['xcommodities'], y=['y_set1'], hue = cmaps['YlOrBr'], s=50, edgecolor='k', linewidth=0.5, legend=True)

EI_plot = pyroplot.stem(data=EU_crit_ind, x=['xcommodities'], y=['y_set2'], hue = cmaps['BuPu'], s=50, edgecolor='k', linewidth=0.5, legend=True)
Morgan Williams
@morganjwilliams
Hey @KSokol79! First, I just want to note that the pyrolite stem plots are generally designed to be used either directly from the dataframe (df[["x", "y"]].pyroplot.stem()) or with numpy arrays (from pyrolite.plot.stem import stem; stem(df["x"].values, df["y"].values)), and some seaborn concepts like passing data/x/y/hue won't work in pyrolite.
If you want something similar to a stacked column chart with 'grouped' stems per element, I'd suggest generating a series of indexes for the x axis which are slightly offset (e.g. indexes = [np.arange(EU_crit_ind.Symbols.size) + 0.1 * x for x in range(4)]) and using these as the indexes for each of your stem series/years.
Morgan Williams
@morganjwilliams
pyrolite doesn't give you support for subplot generation per se, but the API is generally compatible with matplotlib; usually I'd suggest generating subplots with matplotlib's plt.subplots().
As for the last question, pyrolite isn't particularly geared to this, but it should be a similar amount of code to what you would do with matplotlib (which all the current plotting in pyrolite is built from anyway).
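A minimal two-panel layout along these lines is sketched below. It draws the stems with plain matplotlib (`vlines` plus `scatter`) rather than pyrolite's `stem`, and the symbols and values are made up:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

symbols = ["Li", "Co", "Nb"]  # hypothetical commodities
x = np.arange(len(symbols))
sr = np.array([1.2, 2.5, 3.1])  # hypothetical values for the first panel
ei = np.array([4.0, 5.5, 6.0])  # hypothetical values for the second panel

# Two stacked panels sharing the x axis; each Axes can be passed on as ax=...
fig, (ax_sr, ax_ei) = plt.subplots(2, 1, sharex=True, figsize=(8, 5))

for ax, values, title in [(ax_sr, sr, "SR"), (ax_ei, ei, "EI")]:
    ax.vlines(x, 0, values, color="0.5")  # the stems
    ax.scatter(x, values, zorder=3)       # the heads
    ax.set_ylabel(title)

# Label the shared x axis with the commodity symbols.
ax_ei.set_xticks(x)
ax_ei.set_xticklabels(symbols)
```

Plotting functions that accept an `ax` keyword, as `stem` and `spider` do elsewhere in this thread, can be pointed at either panel in the same way.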
Morgan Williams
@morganjwilliams
To get close to what you're after, I put this together (substitute EU_crit_ind for df where relevant); hopefully this points you in the right direction:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from cycler import cycler
from pyrolite.plot.stem import stem

years = ["SR*2011", "SR*2014", "SR2017", "SR2020"]
# generate a set of offset indexes corresponding to each of these years
indexes = [
    np.arange(len(df["Symbols"])) + 1 * (0.1 * x - 0.15) for x in range(len(years))
]
norm = plt.Normalize(vmin=0, vmax=len(indexes) - 1)
cmap = plt.get_cmap("cividis")

# set up a color cycle for this axis corresponding to the above listed colormap
custom_cycler = cycler(color=[cmap(norm(ix)) for ix in range(len(indexes))])

# create the figure and axis
fig, ax = plt.subplots(1, figsize=(12, 4))
ax.set_prop_cycle(custom_cycler)  # set the cycler

for ix, (index, year) in enumerate(zip(indexes, years)):
    stem(index, df[year], ax=ax, label=year)

# set the symbols as the x axis labels
ax.set_xticks(np.arange(len(df["Symbols"])))
ax.set_xticklabels(df["Symbols"])
# take just the scatter collections from the axis (the dots of the stem plot)
# and use them to make the legend
ax.legend(ax.collections, years)