About this item
Data Science For Dummies, 2nd Edition, begins by explaining large data sets and data formats, with sample Python code for manipulating data. It then covers working with relational databases and with unstructured data, including NoSQL, before moving on to preparing data for analysis by cleaning it up, or "munging" it. From there, the book explains data visualization techniques and types of data sets. Part II is devoted to supervised machine learning, including regression and model validation techniques. Part III explains unsupervised machine learning, including clustering and recommendation engines. Part IV gives an overview of big data processing, including MapReduce, Hadoop, Dremel, Storm, and Spark. The book closes with real-world applications of data science and a look at how data science fits into organizations.