Data Analysis with Python
Full tutorial for beginners

About this tutorial
1. What is Data Analysis
2. Real example Data Analysis with Python: https://notebooks.ai/rmotr-curriculum/freecodecamp-pandas-real-life-example
3. How to use Jupyter Notebooks: https://notebooks.ai/rmotr-curriculum/interactive-jupyterlab-tutorial-ac5fa63f/lab
4. Intro to NumPy (exercises included): https://notebooks.ai/rmotr-curriculum/freecodecamp-intro-to-numpy-6c285b74
5. Intro to Pandas (exercises included): https://notebooks.ai/rmotr-curriculum/freecodecamp-intro-to-pandas-902ae59b
6. Data Cleaning: https://notebooks.ai/rmotr-curriculum/data-cleaning-rmotr-freecodecamp
7. Reading Data: SQL, CSVs, APIs, etc.
8. Python in Under 10 Minutes: https://notebooks.ai/rmotr-curriculum/python-under-10-minutes-15addcb2

What is Data Analysis?
> A process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusions and supporting decision-making.
Definition by Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
Data Analysis Tools

Auto-managed closed tools
👎 Closed Source
👎 Expensive 💸
👎 Limited 😩
👍 Easy to learn

Programming Languages
👍 Open Source 🤩
👍 Free (or very cheap) 🤑
👍 Extremely Powerful 💪
👎 Steep learning curve

Why Python for Data Analysis?
Why would we choose Python over R or Julia?
👍 very simple and intuitive to learn
👍 “correct” language
👍 powerful libraries (not just for Data Analysis)
👍 free and open source
👍 amazing community, docs and conferences

When to choose R?
Python, sadly, is not always the answer:
● When RStudio is needed
● When dealing with advanced statistical methods
● When extreme performance is needed
The Data Analysis Process

Data Extraction
● SQL
● Scraping
● File Formats
○ CSV
○ JSON
○ XML
● Consulting APIs
● Buying Data
● Distributed Databases

Data Cleaning
● Missing values and empty data
● Data imputation
● Incorrect types
● Incorrect or invalid values
● Outliers and non-relevant data
● Statistical sanitization

Data Wrangling
● Hierarchical Data
● Handling categorical data
● Reshaping and transforming structures
● Indexing data for quick access
● Merging, combining and joining data

Analysis
● Exploration
● Building statistical models
● Visualization and representations
● Correlation vs Causation analysis
● Hypothesis testing
● Statistical analysis
● Reporting

Action
● Building Machine Learning Models
● Feature Engineering
● Moving ML into production
● Building ETL pipelines
● Live dashboard and reporting
● Decision making and real-life tests

Data Analysis vs Data Science
The traditional view

Python & PyData Ecosystem
The libraries we use...
● pandas: The cornerstone of our Data Analysis job with Python.
● matplotlib: The foundational library for visualizations. Other libraries we’ll use will be built on top of matplotlib.
● numpy: The numeric library that serves as the foundation for numerical calculations in the Python data stack.
● seaborn: A statistical visualization tool built on top of matplotlib.
● statsmodels: A library with many advanced statistical functions.
● scipy: Advanced scientific computing, including functions for optimization, linear algebra, image processing and much more.
● scikit-learn: The most popular machine learning library for Python (not deep learning)

Library links:
● pandas: https://pandas.pydata.org/
● matplotlib: https://matplotlib.org/
● numpy: https://numpy.org/
● seaborn: https://seaborn.pydata.org/
● statsmodels: https://www.statsmodels.org/stable/index.html
● scipy: https://scipy.org/
● scikit-learn: http://scikit-learn.org/stable/

How Python Data Analysts Think
Excel, Tableau, etc.: they’re all visual tools...
Thinking like a Python Data Analyst
And finally, why Python?
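To make the stages of the Data Analysis Process concrete, here is a minimal sketch that walks a tiny dataset through extraction, cleaning, wrangling and analysis in code rather than in a visual tool. It is not taken from the tutorial notebooks: the sample data, column names and function names are all hypothetical, and it uses only the Python standard library so it runs anywhere. In practice the tutorial's libraries, pandas in particular, would handle each stage far more concisely.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical raw data standing in for an extracted CSV file.
RAW_CSV = """region,units,price
north,10,2.5
south,,3.0
north,4,2.5
south,6,not_a_number
east,8,1.5
"""


def extract(text):
    """Data Extraction: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Data Cleaning: convert types, dropping rows with missing or invalid values."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append(
                {
                    "region": row["region"],
                    "units": int(row["units"]),
                    "price": float(row["price"]),
                }
            )
        except ValueError:
            # Empty or non-numeric field: skip the row in this sketch.
            continue
    return cleaned


def wrangle(rows):
    """Data Wrangling: reshape rows into revenue values grouped by region."""
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["region"]].append(row["units"] * row["price"])
    return dict(by_region)


def analyze(grouped):
    """Analysis: compute total and mean revenue per region."""
    return {
        region: {"total": sum(values), "mean": statistics.mean(values)}
        for region, values in grouped.items()
    }


summary = analyze(wrangle(clean(extract(RAW_CSV))))
for region, stats in summary.items():
    print(region, stats)
```

Dropping invalid rows is only one possible cleaning policy; the Data Cleaning list above also mentions data imputation, which would fill in the missing values instead.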