Loklak & YaCy & Social Community Integration

Forum for developers

Loklak & YaCy & Social Community Integration

Beitragvon Orbiter » Di Jul 28, 2015 9:36 am

some months ago I started a new peer-to-peer project called loklak which is about collecting and searching tweets (as a start) from twitter. The architecture is done in such a way that it does not depend on a public API (it simply scrapes websites as all search engines does) and it is not supposed to be based only on twitter. Instead, it shall create a framework to collect status messages from any social media sites.

Several questions may come up why such a project should be done:
(1) what is the purpose of the collection of status messages outside their home platform?
(2) what is the connection to search engines in general and what is the use for YaCy?
(3) what should the free software community do with this?

There are philosophical and practical aspects on these questions. I tried to give the philosophical answer on question (1) at the fossasia conference in Singapure this year:

In short: we now have monopolies (Google, Twitter, Facebook, Instagram) and they will stay for a long time. A free alternative to these plattforms cannot keep up and get better, because they have too much data to make it possible to be as good as they are. The only way to become as good as they are is, to create a collection mechanism to get their data in a free-software environment. That means, we need a data-bootstrap-harvester and that could be made with loklak.

The answer to (2) is important for the YaCy community: search portals do not any more scrape data from web pages, they also should be able to get information from those large-scale social communities. We don't have such a scraping mechanism in YaCy to be able to to this. In this context, loklak can be seen as a 'scraping-sister-project' which may be able to support YaCy to get into the social community data harvesting. I don't see that any other free software search engine has this capability. Using loklak as tool for YaCy will keep us up-to-date to the habits of the internet users to send their content not to web pages any more, but mostly to community message services like twitter and facebook.

The answer to (3) is important for all people who are looking for a free alternative to twitter: once there was identica, but this project is almost dead. An alternative to twitter needs a backend-server software which can organize such data, and loklak is such a server. Right now actually we are building a front-end on loklak which can be found currently on http://test.loklak.net which actually still uses twitter as authentication but will be able to work in its own someday.

tl;dr
I made a twitter-like infrastructure named 'loklak' which may support YaCy to search within social community platforms. Facebook, Instagram may be added as well as data source.
Orbiter
 
Beiträge: 5797
Registriert: Di Jun 26, 2007 10:58 pm
Wohnort: Frankfurt am Main

Re: Loklak & YaCy & Social Community Integration

Beitragvon smokingwheels » Mi Jul 29, 2015 1:42 pm

Wow I will be ready to test any Beta you care to Dream Up.

I have tried scrape Instagram with Yacy but I have too many URL's.

FaceBook has a few conditions https://www.facebook.com/apps/site_scraping_tos_terms.php.
There is a lot of Excuses to come up with like adding a random timer to the scraper.
smokingwheels
 
Beiträge: 137
Registriert: Sa Aug 31, 2013 7:16 am


Zurück zu YaCy Coding & Architecture

Wer ist online?

Mitglieder in diesem Forum: 0 Mitglieder und 2 Gäste

cron