About this item

Many corporations are finding that their data sets are outgrowing their systems' capacity to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system.

As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and MapReduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Bigtop), and analysis (Hive).

The problem is that the Internet offers IT professionals wading into big data many versions of the truth and some outright falsehoods born of ignorance.



About the Author

Michael Frampton

I am keenly interested in new technologies and have investigated the big data domain over the last few years. I have now written three big data books, covering the Hadoop, Spark, and Mesos/DC/OS areas. I tend to concentrate on big data integration across full stacks rather than on individual components.

My IT history is conventional: I have been involved in development, maintenance, support, and testing across a variety of functional domains since 1990. My LinkedIn profile can be found here.

nz.linkedin.com/pub/mike-frampton/20/630/385

I am always interested in making new contacts, hearing about your projects, and hearing about new opportunities. Please feel free to contact me either via LinkedIn or via my website at

semtech-solutions.co.nz


