Semen Kuzovchikov
@spike_k_gitlab
TRACE:root:Probability for initial parameters
/home/semikk/anaconda3/lib/python3.7/site-packages/pycalphad/core/lower_convex_hull.py:136: RuntimeWarning: invalid value encountered in double_scalars
  result_array_GM_values[it.multi_index] = new_energy / molesum
TRACE:root:Likelihood - 6.71s - Thermochemical: -1004.646. ZPF: -3424.380. Activity: -14400.540. Total: -18829.566.
TRACE:root:Proposal - lnprior: 0.0000, lnlike: -18829.5658, lnprob: -18829.5658
Brandon Bocklund
@bocklund
And what's the contribution to each of the types of data for chains with very low probability? I'm guessing that it's primarily the activity data that is reduced
Semen Kuzovchikov
@spike_k_gitlab
I guess you're right
TRACE:root:Likelihood - 15.94s - Thermochemical: -966.256. ZPF: -15987.196. Activity: -772364.421. Total: -789317.873.
Brandon Bocklund
@bocklund
The log-likelihood for the activity data in that proposal is about 50x greater in magnitude than the input, so the set of parameters corresponding to that proposal will be rejected because its probability is much lower (its -lnprob is so much higher than that of any of your chains).
Can you find any examples where the Total is close to your -15,500 or -16,000?
For more context: the output will contain the likelihoods for all proposed sets of parameters (accepted and rejected), but the ones you see in the plot of -lnprob are only the accepted parameters that are in the Markov chains
Just based on how large the log-likelihood is there, my guess is that one or more of your parameters have diverged and you are sampling from very unreasonable parameters.
Can you see if the values of the parameters of the chains that have significant jumps in probability are reasonable? Plotting the parameters might be helpful, i.e. https://espei.org/en/latest/recipes.html#visualize-the-trace-of-each-parameter
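The accept/reject behavior described above can be sketched with a simplified Metropolis rule. This is only an illustration (ESPEI actually uses emcee's affine-invariant ensemble sampler, not plain Metropolis-Hastings); the lnprob values are taken from the traces pasted above.

```python
import math
import random

def metropolis_accept(lnprob_current, lnprob_proposed, u=None):
    """Accept a proposal with probability min(1, exp(lnprob_proposed - lnprob_current))."""
    if lnprob_proposed >= lnprob_current:
        return True
    u = random.random() if u is None else u
    return u < math.exp(lnprob_proposed - lnprob_current)

# A proposal at lnprob ~ -789318 against a chain at ~ -18830 is essentially
# never accepted: exp(-770488) underflows to 0.0, so any draw u > 0 rejects it.
print(metropolis_accept(-18829.57, -789317.87, u=1e-300))  # False
```

This is why wildly bad proposals appear in the log output but never show up in the plot of accepted -lnprob values.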
Semen Kuzovchikov
@spike_k_gitlab
I'm sorry, I've scrolled through all my available output and I can't find any examples with a Total close to -16,000. All of them are either much larger in magnitude or near -18,000
Brandon Bocklund
@bocklund
That's okay. I'm pretty sure it's the activity data. Another way to check would be to start a new run for 1 iteration using the output database from your 1000 iteration runs as the input database, since the final database that you get is based on the parameter set with the highest probability.
But I think it would be more valuable to check into the parameters and see what their values are
Semen Kuzovchikov
@spike_k_gitlab
Parameter values don't seem unreasonable to me. What I see is that, starting from around iteration 200, some walkers just stop changing. [trace plot images attached]
Brandon Bocklund
@bocklund
Are these two parameters corresponding to coefficients a and b in an a+bT excess parameter?
Semen Kuzovchikov
@spike_k_gitlab
Yes, they are; that's why they differ so much in absolute value
Brandon Bocklund
@bocklund
Since the second one is a coefficient that's multiplied by T in the model, that's actually a really big change. A 40x change (-5 to -200) in this parameter would give a decrease in the energy of 40 kJ at 1000 K. Usually a good rule of thumb is that the b coefficient should be roughly 1/3000th of the a coefficient. So for a = -14000, b ~= -5.
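The scale argument above can be checked with some quick arithmetic. These are hypothetical numbers from the discussion; the assumption that the quoted ~40 kJ decrease includes an x_A*x_B composition prefactor on the excess term is mine, not stated in the chat.

```python
# Illustrative arithmetic for an a + b*T excess parameter.
T = 1000.0                          # K
b_change = -200.0 - (-5.0)          # change in the b coefficient
raw_change = b_change * T           # J/mol change in the a + b*T term itself
excess_change = 0.25 * raw_change   # assumed x_A * x_B = 0.25 prefactor at x = 0.5

# Rule of thumb from above: b should be roughly a / 3000
a = -14000.0
b_estimate = a / 3000.0             # about -4.7, i.e. b ~ -5
print(raw_change, excess_change, round(b_estimate, 1))
```

The point is that because b is multiplied by T, a change of a couple hundred in b moves the energy by tens to hundreds of kJ/mol at 1000 K.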
Brandon Bocklund
@bocklund

I'd double check your activity data, plotting with the pycalphad link I sent the other day, if possible.

I noticed in the files you sent me before that there are some unusual values, e.g. Si-acr-75cha_1.json has

"X_SI": [
      0.650,
      0.600,
      0.500,
      0.399,
      0.303,
      0.200,
      0.099,
      0.790,
      0.700
    ]

and

    "values":   [[[
      0.597,
      0.513,
      0.356,
      0.196,
      0.104,
      0.078,
      0.033,
      0.004,
      0.080
      ]]],

For a reference of pure Si, the compositions should follow the trend in the mole fraction of Si. The last two compositions are for X(SI)=0.790 and X(SI)=0.700, which don't match up well with the activities

The ones with the pure Al reference state all seem okay, but you may want to check into some of the other pure Si ones, including Si-acr-75cha_2.json and Si-acr-75cha_3.json (maybe the activities are just out of order in that one?)
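A quick sanity check along these lines, using the values quoted above (a sketch, not an ESPEI feature): for a pure-Si reference state, the activity of Si should increase monotonically with X(SI).

```python
# Composition and activity values quoted from Si-acr-75cha_1.json above
x_si = [0.650, 0.600, 0.500, 0.399, 0.303, 0.200, 0.099, 0.790, 0.700]
acr_si = [0.597, 0.513, 0.356, 0.196, 0.104, 0.078, 0.033, 0.004, 0.080]

# Sort by composition and flag any point where the activity decreases
pairs = sorted(zip(x_si, acr_si))
violations = [(x2, a2) for (x1, a1), (x2, a2) in zip(pairs, pairs[1:]) if a2 < a1]
print(violations)  # the X(SI)=0.700 and X(SI)=0.790 points break the trend
```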
Semen Kuzovchikov
@spike_k_gitlab
Ok, my bad. Thank you. I should plot and double-check all the input data now; there may be other kinds of errors. I'll reply if I still encounter this problem after checking the data.
Brandon Bocklund
@bocklund
My current hypothesis is that some of the inconsistent data may be contributing an unrealistically large log-likelihood and that these non-physical data are dominating the likelihood. If that's true, then some of your parameters (like the second one above) may be able to wander parameter space until they find sets of values that drastically reduce the log-likelihood penalty for the non-physical activity values, but these are just unlucky guesses, and that causes the chains to get stuck there.
I have some alternate hypotheses we can explore if the bad-data hypothesis turns out to be wrong or doesn't fix the issue
bcmiles
@bcmiles
[image.png attached]
Hey, I'm currently trying to run MCMC optimization on a system and hit an error relating to the symengine_wrapper and lambdify. It appears that the issue is related to lambdify not accepting the piecewise function but from what I've seen that should have been resolved already. I'm attaching the error message I get below, any help is much appreciated.
Richard Otis
@richardotis
@bcmiles Does the solution in this issue happen to fix things for you? https://github.com/pycalphad/pycalphad/issues/277#issuecomment-681197147
bcmiles
@bcmiles
No, that gives me a new error, since pip says that pycalphad 0.8.3 requires symengine == 0.6 and that 0.6.1 is incompatible
Richard Otis
@richardotis
It doesn't work if you install it with pip?
You aren't supposed to mix conda and pip packages in the same environment, but that's why this is a workaround
bcmiles
@bcmiles
When I installed with pip it gave me the warning about incompatibility
So I need to create a new environment and try again?
Brandon Bocklund
@bocklund
@bcmiles This is the same error that I get when I run your files, if you're still using those. I don't think it's related to a set of packages or an environment. It might be related to something in the ionic liquid model that's being used here. I still haven't tracked it down.
bcmiles
@bcmiles
@bocklund thanks, I'll keep trying things on my end and see what works
Brandon Bocklund
@bocklund
I'm still investigating on my end as well, I'll keep you updated!
bcmiles
@bcmiles
@bocklund Looks like lambdify didn't like the naming of the liquid phase in the TDB. I originally had it match the Thermo-Calc convention of appending :Y after the phase name to denote it as an ionic liquid, but removing it allows the MCMC to start running
bcmiles
@bcmiles
I'm currently trying to reproduce the Cu-Mg example, specifically plotting the data, but I'm running into the following error when my code hits the multiplot() line: "Process finished with exit code -1073741819 (0xC0000005)". I can plot ZPF data just fine using dataplot, but multiplot is giving me errors. Anyone know what could cause this?
juejing-liu
@juejing-liu

@bocklund
Hi Brandon,

I would like to know whether ESPEI can generate TDB files for metal-gas binary systems (e.g. Fe-O, U-O)?

Best,

Juejing

Brandon Bocklund
@bocklund
Welcome, @juejing-liu! There are no limitations on which elements/species you use for parameter generation. You are only limited by your choice of model, i.e. for parameter generation, only Redlich-Kister polynomial solution model parameters can be generated (associate, ionic liquid, and quasichemical models are not supported)
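For reference, the binary Redlich-Kister excess contribution mentioned above has the form G^xs = x_A x_B Σ_v L_v (x_A − x_B)^v. A minimal sketch (not ESPEI's implementation; the coefficient value is just for illustration):

```python
def redlich_kister_excess(x_a, x_b, L_coeffs):
    # G_xs = x_A * x_B * sum over v of L_v * (x_A - x_B)**v
    return x_a * x_b * sum(L * (x_a - x_b) ** v for v, L in enumerate(L_coeffs))

# A single L0 = -14000 J/mol term evaluated at the equimolar composition:
print(redlich_kister_excess(0.5, 0.5, [-14000.0]))  # -3500.0
```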
juejing-liu
@juejing-liu
Ok, I am mostly working on the ionic liquid model (like U-O). Is there any plan to support this model?
Brandon Bocklund
@bocklund

For the direct parameter generation, it's unlikely that it will be supported because the ionic liquid requires doing the energy minimization step to determine the site ratios of the cation and anion sublattices which allow the liquid to charge balance.

On the other hand, because the ionic liquid model is supported by pycalphad, you can use ESPEI to try to optimize the parameter values via MCMC if you supply a parameterized model with reasonable starting values

juejing-liu
@juejing-liu
Oops I made a mistake. I am working on the ionic solid instead of the ionic liquid. Could ESPEI handle this model?
Brandon Bocklund
@bocklund

pycalphad doesn't support the ionic solid model yet (we had a closed PR that attempted to address it here: pycalphad/pycalphad#222). We have some solver changes upcoming that might make it worth addressing again.

I think that it would be possible to support parameter generation for ionic solids, but that's not a high priority on our current roadmap for ESPEI, and it would be pending support for ionic solids from pycalphad.

Pcascone
@Pcascone
I am trying to install ESPEI via the Anaconda prompt, per the installation instructions, on Windows 10 32-bit, but I get a version incompatibility error: "Output in format: requested package -> Available versions." pycalphad is at the most recent build, 0.8.4.
David Kleiven
@davidkleiven_gitlab

When I try to fit using MCMC I get an error. Here is how I started (datasets are stored in a folder called datasets):

  1. espei --check-datasets datasets: Finishes with no errors
  2. espei --input espei-in.yml: Finishes with no errors

where espei-in.yml looks like this:
system:
  phase_models: mgsnX_input.json
  datasets: datasets
generate_parameters:
  excess_model: linear
  ref_state: SGTE91
output:
  output_db: mgsn.tdb

  3. espei --input mcmc.yml: This command causes errors

where mcmc.yml looks like this
system:
  phase_models: mgsnX_input.json
  datasets: datasets
mcmc:
  iterations: 1000
  input_db: mgsn.tdb
output:
  output_db: mgsn_mcmc.tdb

I get the following error:

/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/codegen/callables.py:97: UserWarning: State variables in build_callables are not {N, P, T}, but {T, P}. This can lead to incorrectly calculated values if the state variables used to call the generated functions do not match the state variables used to create them. State variables can be added with the additional_statevars argument.
"additional_statevars argument.".format(state_variables))
Traceback (most recent call last):
  File "/home/davidkleiven/.local/bin/espei", line 11, in <module>
    load_entry_point('espei', 'console_scripts', 'espei')()
  File "/home/davidkleiven/Documents/ESPEI/espei/espei_script.py", line 311, in main
    run_espei(input_settings)
  File "/home/davidkleiven/Documents/ESPEI/espei/espei_script.py", line 260, in run_espei
    approximate_equilibrium=approximate_equilibrium,
  File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_base.py", line 36, in fit
    node = self._fit(symbols, datasets, *args, **kwargs)
  File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_mcmc.py", line 238, in _fit
    self.predict(initial_guess, ctx)
  File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_mcmc.py", line 305, in predict
    non_eq_thermochemical_prob = calculate_non_equilibrium_thermochemical_probability(parameters=np.array(params), **non_equilibrium_thermochemical_kwargs)
  File "/home/davidkleiven/Documents/ESPEI/espei/error_functions/non_equilibrium_thermochemical_error.py", line 271, in calculate_non_equilibrium_thermochemical_probability
    points=data['calculate_dict']['points'])[output]
  File "/home/davidkleiven/Documents/ESPEI/espei/shadowfunctions.py", line 55, in calculate
    largest_energy=float(1e10), fake_points=fp)
  File "/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/core/calculate.py", line 190, in _compute_phase_values
    param_symbols, parameter_array = extract_parameters(parameters)
  File "/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/core/utils.py", line 361, in extract_parameters
    parameter_array_lengths = set(np.atleast_1d(val).size for val in parameters.values())
AttributeError: 'NoneType' object has no attribute 'values'

Has anyone seen this before, or can anyone point me in the right direction? Thanks in advance for your help.

Brandon Bocklund
@bocklund
Hey @davidkleiven_gitlab, thanks for checking in here. This should be fixed by PhasesResearchLab/ESPEI#133 and a new release should be coming to conda-forge today
David Kleiven
@davidkleiven_gitlab
thanks, it worked now.
zhanghaihui
@zhanghaihui_gitlab
@bocklund Hi Brandon, would you mind explaining the definition and usage of data_weights for me in detail? The description in the manual is too short, I can’t understand how to use this key. Thanks.
Brandon Bocklund
@bocklund

@zhanghaihui_gitlab ESPEI computes the log-likelihood for every type of data by finding the value of the log probability density function of a normal distribution centered at zero with a standard deviation of σ, i.e. norm(loc=0, scale=σ).logpdf(error), where error is the difference between the expected and calculated value for a particular set of parameters.

Since we have multiple types of data, a default value of σ_initial exists for each supported data type to handle the different scales of error and to make the likelihoods comparable even though the errors are not. The data_weights key modifies the σ used in the normal distribution by making σ = σ_initial/data_weight[data_type] for each data_type like ZPF or HM. By default, all data_weight values are 1.0, i.e. σ = σ_initial.

For example: consider that we have two experiments: one for enthalpy of formation data and one for entropy of formation data. If we calculate the error as the difference between the expected enthalpy of formation and the calculated enthalpy of formation, we might find the difference is 10 J/mol. If we do the same for the entropy of formation, we might find that the error is 5 J/K-mol.

Now we need to determine the likelihood for these two errors, one with a difference between the expected/calculated value of 10 and another of 5. Using our existing knowledge about the relative magnitudes of formation enthalpy vs. entropy, we know that an error of 10 J/mol in the enthalpy of formation is very close to the solution (i.e. high likelihood), while an error of 5 J/K-mol in the entropy of formation is probably not that close to the solution (i.e. low likelihood). The σ_initial are set so that the likelihoods of these two cases are comparable, but you might not agree with my choice of σ_initial or you might want to value a particular type of data more heavily than I do. data_weights lets you do that.

Let me know if this helps to clarify data_weights. I can update the docs with an edited version of this to help answer this question in the future
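The σ-rescaling described above can be sketched as follows. This is a minimal illustration, not ESPEI's actual code; the σ_initial and error values are hypothetical.

```python
import math

def norm_logpdf(x, loc=0.0, scale=1.0):
    # Log PDF of a normal distribution, equivalent to
    # scipy.stats.norm(loc, scale).logpdf(x)
    return -0.5 * ((x - loc) / scale) ** 2 - math.log(scale * math.sqrt(2 * math.pi))

def log_likelihood(error, sigma_initial, data_weight=1.0):
    # data_weights rescale sigma: sigma = sigma_initial / data_weight,
    # sharpening the distribution around zero error.
    sigma = sigma_initial / data_weight
    return norm_logpdf(error, loc=0.0, scale=sigma)

# With a sharper distribution, errors that are large relative to the weighted
# sigma are penalized much more heavily:
ll_default = log_likelihood(error=100.0, sigma_initial=500.0)
ll_weighted = log_likelihood(error=100.0, sigma_initial=500.0, data_weight=20.0)
print(ll_weighted < ll_default)  # True
```

Note the flip side: for errors already well inside the weighted σ, a larger weight actually increases the log-likelihood, because the sharper distribution has a taller peak at zero.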
zhanghaihui
@zhanghaihui_gitlab
@bocklund Thank you very much for your answer. I did not understand it at first, but I recently got it after learning some statistical theory; it was very difficult before that.
zhanghaihui
@zhanghaihui_gitlab
@bocklund Another question: how do you determine the likelihood for the different errors, and how do you use existing knowledge about the relative magnitudes to judge, in detail, whether it is acceptable?
zhanghaihui
@zhanghaihui_gitlab
I found I could judge it somewhat from the log output, but I'm not sure. For example, when I changed the data_weights of ZPF to 20, the likelihood of ZPF increased from -3000 to -800, and the total likelihood also increased from -5000 to -1800. But the likelihood of the non-equilibrium thermochemical data increased from -1300 to -800, although I did not change that weight. I think the increase is too large. Is that normal? Is a reasonable likelihood higher than -2000?
bocklund
@bocklund:matrix.org
[m]
Test from Matrix