Miner is a PHP library that extracts metadata and interesting text content (such as author and summary) from HTML pages. It acts like a simplified version of the HTML metadata parser in Apache Tika.
R tools for processing and extracting clinical information from downloaded CPRD cohorts
Node module for extracting text from various file types