See the Harvest homepage http://harvest.sourceforge.net/ for information about Harvest.
Harvest is available for download from the Harvest download page http://prdownloads.sourceforge.net/harvest/.
Harvest-ng is a reimplementation of Harvest's gatherer by Simon Wilkinson. More information is available on the Harvest-ng homepage http://webharvest.sourceforge.net/ng/.
The core of Harvest, located in the src directory, is under the GPL. Additional components, located in the components directory, are under the GPL or a similar license.
Harvest should run on any Unix-like platform, including FreeBSD, Linux and Solaris.
Michael Schlenker has ported Harvest to Windows platforms using Cygwin http://sources.redhat.com/cygwin/.
A 120 MHz Pentium with 64 MB RAM should achieve reasonable performance for around 350 MB of full-text data in about 20,000 objects. A 650 MHz Pentium with 256 MB RAM should be able to handle around 1.5 GB of full-text data in about 100,000 objects.
After the original authors ceased working on Harvest, there were periods when Harvest was unmaintained. During this time the following forked versions of Harvest appeared:
All these forked trees were merged into Harvest 1.6.
For the initial setup, you must be able to modify the web server configuration and to schedule cron jobs. After the initial setup, it is recommended to run Harvest as a different user for security reasons.
Add lines like these to your robots.txt:
User-agent: Harvest
Disallow: /
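To check the effect of these two lines before publishing them, the following is a minimal sketch using Python's standard urllib.robotparser module. It only shows how a standard robots.txt parser interprets the rule, not how Harvest's gatherer itself processes it. The helper name is_allowed, the example URL and the "OtherBot" agent are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines shown above.
ROBOTS_TXT = """\
User-agent: Harvest
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    url = "http://www.example.com/docs/index.html"
    print("Harvest allowed: ", is_allowed(ROBOTS_TXT, "Harvest", url))
    print("OtherBot allowed:", is_allowed(ROBOTS_TXT, "OtherBot", url))
```

Under this rule the parser denies every path to the Harvest user agent, while robots that are not named in the file remain unaffected.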
There are many ways to help improve Harvest, depending on your skills and the time you want to contribute: