//div[contains(@class, 'm-callout')]//img/@src - that seems to be the one we usually detect, but I've only checked a few pages from your index, so I can't tell you whether it would work globally.
The //time[contains(@class, 'date-container')] XPath should do it in your case (based on https://unofficialsf.com/validation-checker-flow-action/) - we should then extract the timestamp automatically. You'll need to re-index after changing this.
Hi @sociallyclimb_twitter, we'd usually recommend having a sitemap or using canonical meta tags - but I suppose that's not super easy to implement on your side.
URL replacements might be an option, e.g. replacing &crid=(\d)+ with an empty string.
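To see what that replacement does before configuring it, here's a minimal Python sketch. It only uses the regex from the message above; the `strip_crid` helper and the example.com URLs are made up for illustration, and this is not how the crawler applies replacements internally.

```python
import re

# Pattern copied from the message above: "&crid=" followed by one or more digits.
CRID_PATTERN = re.compile(r"&crid=(\d)+")


def strip_crid(url: str) -> str:
    """Return the URL with any '&crid=<digits>' part replaced by an empty string."""
    return CRID_PATTERN.sub("", url)


# Hypothetical URLs, for illustration only.
urls = [
    "https://example.com/page?id=7&crid=12345",
    "https://example.com/page?id=7&crid=99&lang=en",
    "https://example.com/page?id=7",
]

cleaned = [strip_crid(u) for u in urls]
for before, after in zip(urls, cleaned):
    print(before, "->", after)
```

Note that the pattern only matches when crid is preceded by `&`. If crid can also be the first parameter (`?crid=123`), that URL stays unchanged and would need a separate replacement.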
Another option might be blacklisting by certain URL parameters, but in that case some pages might get removed from the index (e.g. if you blacklist by a URL parameter, but the URL without that parameter isn't linked anywhere on your site).
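A rough sketch of that risk, assuming the blacklist simply drops every discovered URL that carries the parameter. The `has_param` and `apply_blacklist` helpers and the example.com URLs are hypothetical, and the actual blacklist matching may work differently.

```python
from urllib.parse import parse_qs, urlsplit


def has_param(url: str, param: str) -> bool:
    """True if the URL's query string contains the given parameter name."""
    query = urlsplit(url).query
    return param in parse_qs(query, keep_blank_values=True)


def apply_blacklist(discovered: list[str], param: str) -> list[str]:
    """Keep only the URLs that do not carry the blacklisted parameter."""
    return [u for u in discovered if not has_param(u, param)]


# Hypothetical crawl result: /a is linked both with and without crid,
# while /b is only ever linked with crid.
discovered = [
    "https://example.com/a",
    "https://example.com/a?crid=1",
    "https://example.com/b?crid=2",
]

kept = apply_blacklist(discovered, "crid")
print(kept)
```

Here /a survives because its clean URL is linked somewhere, while /b drops out of the index entirely, which is the case to watch out for.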