Apache Tez for Hadoop >= 2.0

2014-03-12

The Apache Tez project aims to build an application framework in which a data-processing job is expressed as a complex directed acyclic graph (DAG) of tasks. It is currently built on top of Apache Hadoop YARN. Tez has two main design themes:

  • Empowering end users by:
    • Expressive dataflow definition APIs
    • Flexible Input-Processor-Output runtime model
    • Data type agnostic
    • Simplifying deployment
  • Execution Performance
    • Performance gains over MapReduce
    • Optimal resource management
    • Plan reconfiguration at runtime
    • Dynamic physical data flow decisions
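
To make the "directed acyclic graph of tasks" idea concrete, here is a minimal sketch in plain Python. It is not the Tez API: `run_dag`, the vertex names and the word-count pipeline are all hypothetical, and everything runs in one process rather than on YARN. Each vertex is a processor function that reads the outputs of its upstream vertices and produces one output, which loosely mirrors the Input-Processor-Output model mentioned above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(processors, edges, source_inputs):
    """Run a toy DAG of tasks (illustration only, not how Tez schedules work).

    processors:    {vertex_name: function(list_of_inputs) -> output}
    edges:         [(upstream, downstream), ...]
    source_inputs: {vertex_name: initial input for vertices with no upstream}
    """
    # Map each vertex to the upstream vertices it reads from.
    upstream = {name: [] for name in processors}
    for src, dst in edges:
        upstream[dst].append(src)

    outputs = {}
    # static_order() yields each vertex after all of its predecessors
    # and raises CycleError if the graph is not acyclic.
    for name in TopologicalSorter(upstream).static_order():
        if upstream[name]:
            inputs = [outputs[u] for u in upstream[name]]
        else:
            inputs = [source_inputs[name]]
        outputs[name] = processors[name](inputs)
    return outputs


# Hypothetical three-vertex pipeline: tokenize -> count -> top
processors = {
    "tokenize": lambda ins: ins[0].split(),
    "count": lambda ins: {w: ins[0].count(w) for w in set(ins[0])},
    "top": lambda ins: max(ins[0], key=ins[0].get),
}
edges = [("tokenize", "count"), ("count", "top")]

result = run_dag(processors, edges, {"tokenize": "tez yarn tez hadoop tez"})
print(result["top"])
```

A real framework would additionally run each vertex as many parallel tasks and move data between them over the network; the sketch only shows the ordering constraint that the graph imposes.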




