MR70

Statistics of #Hadoop #Opensource #Expo2015 tweets in January 2015

 2015-02-05 |  #twitter

Analyzing JSON Twitter data with Apache Spark and Apache Hadoop is easy. Below are some examples: tweets about Expo2015, tweets about Hadoop, and tweets about opensource.

Continue reading 

How to manage tweets saved in #Hadoop using #Apache #Spark SQL

 2015-01-15 |  #Me

Instead of the old Hadoop way (MapReduce), I suggest the newer and faster way (Apache Spark on top of Hadoop YARN): in a few lines you can open all tweets (gzipped JSON files saved in several subdirectories, hdfs://path/to/YEAR/MONTH/DAY/*gz) and query them in a SQL-like language: sc = SparkContext(appName=“extraxtStatsFromTweets.
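
As a rough illustration of the idea described above, here is a minimal sketch, not the script from the post. It assumes a recent PySpark and uses the SparkSession API rather than a bare SparkContext. The names `tweets_glob`, `tweets_per_language` and the application name are hypothetical, and the `lang` field is assumed to be present in the collected tweets.

```python
def tweets_glob(base, year="*", month="*", day="*"):
    """Build the glob for gzipped tweet files laid out as base/YEAR/MONTH/DAY/*gz."""
    return f"{base}/{year}/{month}/{day}/*gz"


def tweets_per_language(path):
    """Load the JSON tweets under `path` and count them per language with Spark SQL."""
    # Imported lazily so the path helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("tweetStatsSketch").getOrCreate()
    # Spark decompresses .gz files transparently and infers the JSON schema.
    tweets = spark.read.json(path)
    tweets.createOrReplaceTempView("tweets")
    # Assumption: every tweet record carries a `lang` field.
    rows = spark.sql(
        "SELECT lang, COUNT(*) AS n FROM tweets GROUP BY lang ORDER BY n DESC"
    ).collect()
    spark.stop()
    return [(row["lang"], row["n"]) for row in rows]
```

Calling `tweets_per_language(tweets_glob("hdfs://path/to", "2015", "01"))` would then query every day of January 2015; `hdfs://path/to` is the placeholder from the text, not a real location.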

Continue reading 

How to manage tweets saved in #Hadoop using #Apache #Spark

 2014-11-25 |  #Me

Apache Spark has just passed Hadoop in popularity on the web (Google Trends). My first use of Apache Spark was extracting the text of tweets I’ve been collecting in Hadoop HDFS. My Python script tweet-texts.
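
The script itself is not shown in this excerpt, so the following is only a sketch of how such an extraction might look with PySpark's RDD API. It assumes one JSON object per line with the tweet body in a `text` field; `extract_text`, `tweet_texts`, the application name and the `limit` parameter are hypothetical names.

```python
import json


def extract_text(line):
    """Return the tweet text from one JSON line, or None if there is none."""
    try:
        tweet = json.loads(line)
    except ValueError:
        return None  # skip malformed lines
    if not isinstance(tweet, dict):
        return None
    # Records without a `text` key (assumed to be non-tweet messages) yield None.
    return tweet.get("text")


def tweet_texts(path, limit=20):
    """Return up to `limit` tweet texts from the JSON-lines files under `path`."""
    # Imported lazily so extract_text can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="tweetTextsSketch")
    texts = (
        sc.textFile(path)                    # gzipped files are read transparently
        .map(extract_text)
        .filter(lambda text: text is not None)
        .take(limit)
    )
    sc.stop()
    return texts
```

Keeping the parsing in a plain function makes it possible to check it on a few sample lines before running it on the cluster.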

Continue reading 

Workflows in Apache Hadoop

 2014-10-03 |  #airflow #apache #bigdata #hadoop #luigi #oziee #workflow

How to orchestrate your Hadoop jobs? Possible solutions are: Apache Oozie, included in the top Hadoop distributions; Azkaban from LinkedIn; Luigi from Spotify; Apache Airflow from Airbnb. See for instance a comparison among Luigi, Airflow and Pinball at http://bytepawn.
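
All of these tools share one core idea: jobs declare their dependencies and the scheduler runs them in a valid order. The sketch below illustrates only that idea with Python's standard `graphlib` module (Python 3.9+). It does not use the API of any of the tools listed, and the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; each one maps to the set of jobs it depends on.
jobs = {
    "collect_tweets": set(),
    "extract_texts": {"collect_tweets"},
    "compute_stats": {"extract_texts"},
    "publish_report": {"compute_stats"},
}


def run_order(dependencies):
    """Return the jobs in an order that respects every declared dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_all(dependencies, runner=print):
    """Call `runner` on each job name, upstream jobs first."""
    for job in run_order(dependencies):
        runner(job)
```

A real workflow engine adds scheduling, retries, and monitoring on top of this ordering step.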

Continue reading 

#Hadoop Search with Apache #Solr

 2014-10-03 |  #Me

The two top Hadoop distributions (Cloudera and Hortonworks, but remember that Hadoop is free software and many companies pay nothing to use it!) include Apache Solr as the Hadoop search tool. See the apache-solr-hadoop-search article and the following two presentations from the two vendors: [slideshare id=35810888&doc=solr-2-140612162029-phpapp02] [slideshare id=24255985&doc=hadoopplussolrbigdatasearch-130715115557-phpapp02]. See also the Natural Language Processing and Sentiment Analysis for Retailers using HDP and ITC Infotech Radar article.

Continue reading 
