
Intro to DS Assignment 1 (Amna Iqbal)....

The document discusses three common tools in data science: Python, R, and Power BI, highlighting their use cases and advantages. Python is versatile for data manipulation, analysis, and visualization, R excels in statistical analysis and visualization, while Power BI is user-friendly for creating interactive dashboards and reports. Each tool serves distinct purposes within data science workflows, catering to different user needs and technical expertise.

Uploaded by

amna ch
Copyright
© All Rights Reserved

Assignment # 01

Name:
Amna Iqbal
Roll No:
06
Course Title:
Intro to Data Science
Submitted To:
Sir Bilal
Submitted By:
Amna Iqbal
Session:
2022-2026
Date of Submission:
22-09-2024

Faculty of Computing and Engineering
Department of Data Science
University of Kotli Azad Jammu & Kashmir
o Discuss at least three tools (Python, R, SQL, Tableau, or Power BI) that
are commonly used in data science workflows.

o Explain their use cases and the advantages they offer for various data
science tasks such as data manipulation, analysis, and visualization.

Here’s an overview of three common tools—Python, R, and Power BI—used in data science
workflows, along with their use cases and advantages for data manipulation, analysis, and
visualization.
1. Python
 Use Cases:
 Data Manipulation: Python's Pandas library is one of the most popular tools for
data cleaning, reshaping, filtering, and merging datasets. It is best suited to
structured (tabular) data; unstructured data such as free text or images
usually calls for additional libraries.
 Data Analysis: With libraries like NumPy, SciPy, and statsmodels, Python
excels in performing statistical analysis, linear algebra, and hypothesis testing. It
also supports machine learning with tools like scikit-learn and deep learning
with frameworks like TensorFlow and PyTorch.
 Data Visualization: Python supports both static and interactive visualization
through libraries such as Matplotlib, Seaborn, and Plotly.
 Advantages:
 Versatility: Python covers the entire data science pipeline—from data extraction
and cleaning to analysis, modeling, and visualization.
 Extensive Libraries: Python offers an extensive range of libraries, making it
suitable for everything from simple data manipulation to complex machine
learning and AI tasks.
 Scalability: Python integrates well with big data tools like Apache Spark and
Hadoop, making it suitable for large-scale data processing.
 Community Support: Python has a large user base, ensuring frequent updates
and extensive documentation.
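The cleaning, filtering, and merging steps described under Data Manipulation can be sketched with Pandas roughly as follows. This is a minimal illustration, not part of the original assignment: the two small tables, their column names, and the threshold of 100 are all made-up assumptions chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical sample data; table and column names are illustrative only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [250.0, None, 120.0, 80.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city": ["Kotli", "Mirpur", "Kotli"],
})

# Cleaning: drop rows where the amount is missing.
clean = orders.dropna(subset=["amount"])

# Filtering: keep only orders of at least 100 (assumed threshold).
large = clean[clean["amount"] >= 100]

# Merging: attach each customer's city to the remaining orders.
merged = large.merge(customers, on="customer_id", how="left")

# Simple analysis: total order amount per city.
totals = merged.groupby("city")["amount"].sum()
print(totals)
```

Each step returns a new DataFrame, so intermediate results such as `clean` and `large` can be inspected on their own before moving on to the next step.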
2. R
 Use Cases:
 Statistical Analysis: R is renowned for its powerful statistical and mathematical
capabilities. It provides tools for conducting a wide range of statistical tests,
modeling, and simulations. The tidyverse collection, which includes ggplot2 and
dplyr, is popular for data manipulation and visualization.
 Data Visualization: R's ggplot2 is a favorite for creating sophisticated
visualizations, especially for academic and research purposes, where precision is
critical.
 Data Manipulation: dplyr and data.table are commonly used for manipulating
and wrangling datasets, especially when working with large, complex datasets.
 Advantages:
 Specialized for Statistics: R is particularly useful for in-depth statistical analysis,
offering an array of built-in statistical functions that are not as readily available in
other languages.
 Strong Visualization: R's visualization packages, like ggplot2, create
publication-quality graphics, making it a preferred tool for research.
 Rich Ecosystem for Analytics: R has a strong focus on analytics and is used
extensively in academia and industries where statistical rigor is a priority.

3. Power BI
 Use Cases:
 Data Visualization: Power BI is primarily used for creating interactive
dashboards and reports. It allows users to visualize data from various sources and
share insights across an organization.
 Business Intelligence: Power BI is ideal for business intelligence applications, as
it enables non-technical users to generate insightful visual reports and perform
drill-down analysis.
 Data Exploration: Power BI allows users to easily explore large datasets and
create dynamic, real-time reports with minimal technical expertise.
 Advantages:
 Ease of Use: Power BI has an intuitive drag-and-drop interface, allowing
non-technical users to build reports and dashboards quickly.
 Integration with Microsoft Ecosystem: Power BI integrates seamlessly with
Excel, Azure, SQL Server, and other Microsoft tools, making it a great fit for
organizations already within the Microsoft ecosystem.
 Real-Time Data Analysis: Power BI supports real-time data streaming and
integration, making it highly useful for dynamic reporting in business
environments.

Comparison of Use Cases and Advantages for Data Science Tasks:


| Task | Python | R | Power BI |
|---|---|---|---|
| Data Manipulation | Pandas and NumPy for flexible data wrangling and transformations. | dplyr and data.table for handling complex datasets. | Limited data manipulation capabilities; mostly designed for visualization and reporting. |
| Data Analysis | Machine learning, statistical analysis, and custom algorithm development with scikit-learn, SciPy, and more. | Rich library ecosystem for statistical and mathematical modeling. | Not suitable for complex analysis; focuses on presenting analyzed data. |
| Data Visualization | Highly customizable visualizations with Matplotlib, Seaborn, and Plotly. | ggplot2 excels in high-quality, publication-ready plots. | Best for interactive dashboards and reports; ideal for business intelligence. |

In summary, Python is versatile and robust, covering all stages of data science workflows, R is
excellent for statistical analysis and research, while Power BI is focused on business intelligence,
making it easy for non-technical users to create interactive visual reports.
