Data Analysis with Open Source Tools
eBook Details:
- Paperback: 538 pages
- Publisher: O’Reilly Media; 1st edition (November 25, 2010)
- Language: English
- ISBN-10: 0596802358
- ISBN-13: 978-0596802356
eBook Description:
Data Analysis with Open Source Tools: A hands-on guide for programmers and data scientists
Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You’ll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.
Author Philipp Janert teaches you how to think about data: how to effectively approach data analysis problems, and how to extract all of the available information from your data. Janert covers univariate data, data in multiple dimensions, time series data, graphical techniques, data mining, machine learning, and many other topics. He also reveals how seat-of-the-pants knowledge can lead you to the best approach right from the start, and how to assess results to determine if they’re meaningful.
Along the way, you’ll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you’ll learn how to think about the results you want to achieve – rather than rely on tools to think for you.
These days it seems like everyone is collecting data. But all of that data is just raw information – to make that information meaningful, it has to be organized, filtered, and analyzed. Anyone can apply data analysis tools and get results, but without the right approach those results may be useless.
- Use graphics to describe data with one, two, or dozens of variables
- Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
- Mine data with computationally intensive methods such as simulation and clustering
- Make your conclusions understandable through reports, dashboards, and other metrics programs
- Understand financial calculations, including the time-value of money
- Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
- Become familiar with different open source programming environments for data analysis
“Finally, a concise reference for understanding how to conquer piles of data.”
- Austin King, Senior Web Developer, Mozilla
“An indispensable text for aspiring data scientists.”
- Michael E. Driscoll, CEO/Founder, Dataspora
About the Author
Philipp K. Janert
After previous careers in physics and software development, Philipp K. Janert currently provides consulting services for data analysis, algorithm development, and mathematical modeling. He has worked for small start-ups and in large corporate environments, both in the U.S. and overseas. He prefers simple solutions that work to complicated ones that don’t, and thinks that purpose is more important than process. Philipp is the author of “Gnuplot in Action – Understanding Data with Graphs” (Manning), and has written for the O’Reilly Network, IBM developerWorks, and IEEE Software. He is a named inventor on a handful of patents, and is an occasional contributor to CPAN. He holds a Ph.D. in theoretical physics from the University of Washington. Visit his company website at www.principal-value.com.