I previously relied on the Yahoo Boss API while experimenting with YaCy alongside it, to test the ground for a possible transition. Last month Yahoo tripled its API pricing, so it has become more important to properly test YaCy's capabilities on suitable hardware and decide on a possible move.
For the sake of the hype, here's the hardware:
- Motherboard: dual-CPU 771 Supermicro
- RAM: 32GB ECC fully-buffered DDR2 (in the picture: 24GB by the box on the right, 16GB mounted on the motherboard; the final configuration will have 32GB)
- CPU: 2 x Intel Xeon L5420, quad-core (yet to arrive)
- Storage: 4 x 1TB, 7200 rpm (2 SATA, 2 SCSI) [2 for RAID1, 1 for backup, 1 for the host filesystem]
Everything will be assembled in a week, when the Xeons arrive.
The existing YaCy virtual machine (currently running on a smaller server) will run under OpenVZ and will be moved to this server as soon as it's assembled.
The target document count is 100M, and the main sources for crawling will be Amazon, Newegg and Best Buy. The node publicly shares its indexes and supports the global community network.
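To get a feel for what 100M documents means on this box, here is a rough back-of-the-envelope sketch in Python. It only divides the figures listed above by the target count. The assumption that the index will live on the RAID1 pair is mine, as are the function names, and the real per-document footprint of a YaCy index is not something I have measured here.

```python
# Rough per-document budget. Figures come from the hardware list above.
# ASSUMPTION: the index lives on the RAID1 pair (not confirmed anywhere).
TARGET_DOCS = 100_000_000      # target document count (100M)
DISK_BYTES = 1 * 10**12        # one 1TB drive, decimal terabytes
RAID1_DISKS = 2                # two drives mirrored
RAM_BYTES = 32 * 2**30         # 32GB of RAM, binary gigabytes


def raid1_usable_bytes(disk_bytes: int, disks: int) -> int:
    """RAID1 mirrors the data, so usable capacity equals one disk."""
    if disks < 2:
        raise ValueError("RAID1 needs at least two disks")
    return disk_bytes


def per_document_budget(total_bytes: int, docs: int) -> float:
    """Bytes available per document if the space is split evenly."""
    return total_bytes / docs


disk_budget = per_document_budget(
    raid1_usable_bytes(DISK_BYTES, RAID1_DISKS), TARGET_DOCS
)
ram_budget = per_document_budget(RAM_BYTES, TARGET_DOCS)

print(f"Disk budget per document: {disk_budget:.0f} bytes")
print(f"RAM budget per document:  {ram_budget:.1f} bytes")
```

Under those assumptions the mirror leaves about 10 KB of disk and roughly 344 bytes of RAM per document, which is only an upper bound on what the index may use, not a prediction of what it will use.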