some months ago I started a new peer-to-peer project called loklak which is about collecting and searching tweets (as a start) from twitter. The architecture is done in such a way that it does not depend on a public API (it simply scrapes websites as all search engines does) and it is not supposed to be based only on twitter. Instead, it shall create a framework to collect status messages from any social media sites.
Several questions may come up why such a project should be done: (1) what is the purpose of the collection of status messages outside their home platform? (2) what is the connection to search engines in general and what is the use for YaCy? (3) what should the free software community do with this?
There are philosophical and practical aspects on these questions. I tried to give the philosophical answer on question (1) at the fossasia conference in Singapure this year:
In short: we now have monopolies (Google, Twitter, Facebook, Instagram) and they will stay for a long time. A free alternative to these plattforms cannot keep up and get better, because they have too much data to make it possible to be as good as they are. The only way to become as good as they are is, to create a collection mechanism to get their data in a free-software environment. That means, we need a data-bootstrap-harvester and that could be made with loklak.
The answer to (2) is important for the YaCy community: search portals do not any more scrape data from web pages, they also should be able to get information from those large-scale social communities. We don't have such a scraping mechanism in YaCy to be able to to this. In this context, loklak can be seen as a 'scraping-sister-project' which may be able to support YaCy to get into the social community data harvesting. I don't see that any other free software search engine has this capability. Using loklak as tool for YaCy will keep us up-to-date to the habits of the internet users to send their content not to web pages any more, but mostly to community message services like twitter and facebook.
The answer to (3) is important for all people who are looking for a free alternative to twitter: once there was identica, but this project is almost dead. An alternative to twitter needs a backend-server software which can organize such data, and loklak is such a server. Right now actually we are building a front-end on loklak which can be found currently on http://test.loklak.net which actually still uses twitter as authentication but will be able to work in its own someday.
tl;dr I made a twitter-like infrastructure named 'loklak' which may support YaCy to search within social community platforms. Facebook, Instagram may be added as well as data source.