Data Analytics Report
ON
DATA ANALYTICS PROCESS AUTOMATION
VIRTUAL INTERNSHIP
Submitted By
Name: B. Durga Sravani
Regd.No: 21K61A0419
Submitted to
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
SASI INSTITUTE OF TECHNOLOGY & ENGINEERING
(Approved by AICTE, New Delhi, Permanently Affiliated to JNTUK, Kakinada and
SBTET-Hyderabad, Accredited by NAAC with ‘A’ Grade, Ranked as "A" Grade by
Govt. of A.P., Recognized by UGC 2(f) & 12(B)) Kadakatla, Tadepalligudem–534101
Academic Year 2023-2024
DECLARATION
I, B. Durga Sravani (21K61A0419), a student of Electronics and
Communication Engineering at Sasi Institute of Technology &
Engineering, Tadepalligudem, hereby declare that the Summer Training
Report entitled “Data Analytics Process Automation Virtual
Internship” is an authentic record of my own work, submitted as a requirement of
Industrial Training during the period from date to final date. I obtained
the knowledge through the selfless efforts of the employee assigned to
me by the administration. A Training Report was prepared on the same, and the
suggestions given by the faculty were duly incorporated.
B.Durga Sravani
21K61A0419
SIGNATURE OF                    SIGNATURE OF
Incharge                        Head of the Department
ACKNOWLEDGEMENT
I would like to thank the entire team at Edu Skills, India, who provided me with this
summer training. I express my sincere thanks to Mr. P. Srinivasa Sharma, Director,
for giving me a great opportunity to work in such a domain.
With Gratitude,
B.Durga Sravani
21K61A0419
Vision & Mission
Vision of the Institute
❖ Confect as a premier institute for professional education by creating technocrats
who can address the society's needs through inventions and innovations.
Mission of the Institute
These PEO’s are meant to prepare our students to thrive and to lead in their
career. Our graduates will be able
Graduates will have strong knowledge about IT applications
with leadership qualities
P1
12. Lifelong learning: Recognize the need for, and have the preparation and
ability to engage in independent and life-long learning in the broadest context of
technological change.
CHAPTER  TITLE
         Abstract
1        Introduction
         Process Automation
2        System Requirements
3        Technical Description
         APPENDIX-A
         APPENDIX-B
ABSTRACT
Many organizations are seeking the next step in modernizing their office of finance. Intelligent
Automation for Data Analytics Process Automation solutions on Alteryx offer organizations a
code-free analytic automation platform that easily integrates various data sources and advanced
analytic technologies to leverage process automation. With these solutions, finance teams can
innovate and modernize tax and financial audit processes with automated self-service analytics that
unify data prep, analytics, and process automation for greater simplicity.
In this project we study the Alteryx Designer tools and carry out a data analytics process using the
tools provided by Alteryx. Alteryx unifies analytics, data science and business process automation in
one end-to-end platform to accelerate digital transformation. Its unified platform and self-service,
easy-to-use interface differentiate it from other analytics or data science solutions on the
market.
The Alteryx Data Analytics Automation Platform abstracts complexity so that anyone can
independently develop their own paths to better decisions. Collectively, this improves analytics
maturity and leads to higher-value outcomes. The four proficiencies that IDC Research found necessary
to realize the greatest ROI from analytics investments align with the capabilities and
services available in the Alteryx Analytics Automation Platform.
CHAPTER 1
INTRODUCTION
Introduction of Data Analytics Process Automation
Data Analytics Process Automation is the process of using advanced computer programs and
simulations to examine digital information. Depending on a business' industry, its staff might collect
statistical data on customer information, production processes, profitability or performance metrics.
Using this data to inform important business decisions can help keep a business profitable, but
analyzing these data points manually can be time-consuming and costly.
Automated analytics systems save time and funds, as you can input data directly into software that
generates reports and makes recommendations based on user preferences. This type of automation
is particularly useful for companies that handle big data, as there could be many individual data
points to analyze on a day-to-day basis. By working with automation software, business owners can
produce more reliable results while prioritizing funds for other projects.
Alteryx Introduction
Alteryx Analytic Process Automation (APA) unifies analytics, data science and process automation
in one, end-to-end platform to accelerate digital transformation and rapidly upskill the modern
workforce. The Alteryx APA Platform™ provides hundreds of automation building blocks for
data prep and blending, diagnostic and predictive analytics, AutoML, and code-free data science.
The no-code/low-code, self-service platform requires no specialized skill sets and is designed to put
automation in the hands of all data workers; it can automate analytics and data science pipelines,
manage complex data-centric business processes and deliver actionable insights to stakeholders in
every line of business. Thousands of organizations globally use Alteryx to deliver quick wins and
high-impact business outcomes.
Alteryx Designer: Alteryx Designer allows analysts and data scientists alike to prep and blend data
and to create statistical, predictive, and forecasting models. They can enrich and analyze their
data and build analytics apps, dashboards and models to share with others.
Alteryx Designer empowers data analysts by combining data preparation, data blending, and
analytics—predictive, statistical, and spatial—using the same intuitive user interface.
Alteryx Server: Alteryx Server is a secure and scalable server-based product for scheduling, sharing
and running apps, dashboards and models created in Alteryx Designer for others in the organization
to leverage.
CHAPTER 2
SYSTEM REQUIREMENTS
Required Minimums for Alteryx Designer
Machine Requirements    Minimum: 64-bit    High Performance: 64-bit
RAM                     8 GB               16 GB
CHAPTER 3
TECHNICAL DESCRIPTION
Overview of Alteryx Designer
In this project we study the Alteryx Designer tools and carry out a data analytics process using the
tools provided by Alteryx.
Alteryx Designer contains four primary components used to construct a workflow:
1. Tool Palette
2. Canvas
3. Configuration Window
4. Results Window
Tool Palette
The Tool Palette consists of tools organized into tool categories. To build a workflow, click a tool
on the Tool Palette and drag it onto the workflow window. You can also add tools to a workflow by
right-clicking the workflow window, selecting Insert, and choosing a tool from a list of tool category
names. The Tool Palette contains tools that are used to alter data. Tools are separated into categories
based on types of function and performance, such as:
1. In / Out
2. Preparation
3. Join
4. Parse
5. Transform
6. In-Database
7. Reporting
8. Documentation
1. In/Out
Each workflow must contain inputs and outputs. The Input and Output tools have different
configuration properties, depending on the file type. The Browse tool offers a temporary view of
what the data looks like in table, map, or report format.
Browse: The Browse tool offers complete views of the underlying data within the Alteryx
workflow. A Browse tool can be attached anywhere in the workflow stream to view the resulting
data at that point.
Date Time Now: This Macro will return a single record: the Date and Time at the workflow
runtime, and convert the value into the string format of the user's choosing.
Directory: The Directory tool returns all the files in a specified directory. Along with file
names, other pertinent information about each file is returned, including file size, creation date, last
modified, and much more.
Input Data: The Input Data tool can be the starting point for any project in Alteryx.
Every project must have an input and output. The input tool opens the source data to be
used in the analysis. The input tool reads information from the following file formats: CSV, MDB,
DBF, XLS, MID/MIF, SHP, TAB, GEO, SZ, YXDB, SDF, FLAT, OleDB, Oracle Spatial.
Map Input: Manually draw or select map objects (points, lines, and polygons) to be stored
in the workflow.
Output Data: The Output Data tool is used whenever results of the analysis need to be written to
a file. Every project must have an input and output. The Output Data tool writes the results
derived from the analysis to the same variety of formats specified for the Input Data tool.
Text Input: The Text Input tool makes it possible for the user to manually type text to create
small data files for input. It is useful for creating Lookup tables on the fly, for example.
2. Preparation
The Preparation category includes tools that prepare data for downstream analysis.
Auto Field Strings: The Auto Field tool reads through an input file and sets the field type to
the smallest possible size relative to the data contained within the column.
Create Samples: This tool splits the input records into two or three random samples. In the
tool you specify the percentage of records that are in the estimation and validation samples. If the
total is less than 100%, the remaining records fall in the holdout sample.
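As a rough illustration of the split described above, the following Python sketch divides a set of records into estimation, validation, and holdout samples. It is not how Alteryx implements the tool; the function name `create_samples`, its parameters, and the fixed random seed are assumptions made for this example.

```python
import random

def create_samples(records, estimation_pct, validation_pct, seed=1):
    """Split records into estimation, validation, and holdout samples."""
    if estimation_pct + validation_pct > 100:
        raise ValueError("percentages must not total more than 100")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_est = len(shuffled) * estimation_pct // 100
    n_val = len(shuffled) * validation_pct // 100
    estimation = shuffled[:n_est]
    validation = shuffled[n_est:n_est + n_val]
    holdout = shuffled[n_est + n_val:]  # whatever the two percentages leave over
    return estimation, validation, holdout

est, val, hold = create_samples(range(100), estimation_pct=60, validation_pct=20)
print(len(est), len(val), len(hold))
```

With 100 records, 60% estimation and 20% validation, the remaining 20 records fall into the holdout sample.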
Data Cleansing: The Data Cleansing tool can fix common data quality issues using a variety
of parameters.
Date Filter: The Date Filter macro is designed to allow a user to easily filter data based on
date criteria using a calendar-based interface.
Filter: The Filter tool tests each record in your file against specified criteria. The tool creates
two outputs, True and False. True receives the records that met the specified criteria; False
receives those that did not.
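The two-output behaviour of the Filter tool can be sketched in plain Python as below. The function `filter_records`, the sample `orders` data, and the condition used are hypothetical and serve only to illustrate the idea.

```python
def filter_records(records, predicate):
    """Route each record to the True or False output, depending on the predicate."""
    true_out, false_out = [], []
    for record in records:
        if predicate(record):
            true_out.append(record)
        else:
            false_out.append(record)
    return true_out, false_out

# Hypothetical sample data
orders = [
    {"id": 1, "amount": 250},
    {"id": 2, "amount": 80},
    {"id": 3, "amount": 120},
]

true_out, false_out = filter_records(orders, lambda r: r["amount"] > 100)
print([r["id"] for r in true_out])   # records that met the criteria
print([r["id"] for r in false_out])  # records that did not
```

No record is discarded: every input record appears in exactly one of the two outputs.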
Formula: The formula tool is a powerful processor of data and formulas. Use it to add a field
to an input table, to create new data fields based on an expression or by assigning a data relationship,
or to update an existing field based on these same premises.
Generate Rows: The Generate Rows tool will create new rows of data, at the record level.
This tool is useful to create a sequence of numbers, transactions, or dates.
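Generating a sequence of rows can be thought of as a loop with a starting value, a condition, and a step. The Python sketch below follows that idea; the function name `generate_rows` and its three parameters are assumptions chosen for this illustration, not the tool's actual configuration fields.

```python
from datetime import date, timedelta

def generate_rows(initial, condition, step):
    """Create rows starting at `initial`, applying `step` while `condition` holds."""
    rows = []
    value = initial
    while condition(value):
        rows.append(value)
        value = step(value)
    return rows

# A sequence of numbers
numbers = generate_rows(1, lambda n: n <= 5, lambda n: n + 1)

# A sequence of dates, one row per day
days = generate_rows(
    date(2024, 1, 1),
    lambda d: d <= date(2024, 1, 3),
    lambda d: d + timedelta(days=1),
)
print(numbers)
print(days)
```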
Imputation: The Imputation tool updates specific values in a numeric data field with another
selected value. It is useful for replacing NULL values.
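A minimal Python sketch of replacing NULL values is shown below. Here `None` stands in for NULL, and the choice to fall back to the average of the known values when no replacement is given is an assumption of this example; the function name `impute_nulls` is hypothetical.

```python
from statistics import mean

def impute_nulls(values, replacement=None):
    """Replace None entries with `replacement`, or with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known) if replacement is None else replacement
    return [fill if v is None else v for v in values]

readings = [10, None, 30, None]
by_mean = impute_nulls(readings)                  # fill with the average of 10 and 30
by_zero = impute_nulls(readings, replacement=0)   # fill with a user-selected value
print(by_mean)
print(by_zero)
```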
Multi-Field Binning: The Multi-Field Binning tool groups multiple numeric fields into tiles
or bins, especially for use in predictive analysis.
Multi-Field Formula: The Multi-Field Formula tool makes it easy to execute a single
function on multiple fields.
Multi-Row Formula: The Multi-Row Formula tool takes the concept of the Formula Tool a
step further, allowing the user to utilize row data as part of the formula creation. This tool is useful
for parsing complex data, and creating running totals, averages, percentages and other mathematical
calculations.
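The idea of using the previous row inside a formula can be sketched in Python as follows. The helper `multi_row` and the sample `sales` values are hypothetical; the sketch only shows how a running total arises when each row's result depends on the row before it.

```python
def multi_row(values, formula, initial=0):
    """Apply formula(previous_result, current_value) to each row in order."""
    results = []
    previous = initial
    for value in values:
        previous = formula(previous, value)
        results.append(previous)
    return results

sales = [100, 50, 25]

# Running total: previous row's total plus the current row's value
running_total = multi_row(sales, lambda prev, cur: prev + cur)
print(running_total)
```

Changing only the formula gives other row-dependent calculations, for example a running maximum with `lambda prev, cur: max(prev, cur)`.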
Oversample Field: This tool samples incoming data so that each data value is equally
represented and the data can be used effectively in a predictive model.
Record ID: The Record ID tool creates a new column in the data and assigns a unique
identifier, that increases sequentially, for each record in the data.
Sample: The Sample tool extracts a specified portion of the records in the data
stream.
Select: The Select tool is a multi-function utility for choosing which fields are carried
downstream, renaming fields, reordering field positions in the file, changing field types,
and loading or saving field configurations.
Select Records: The Select Records tool selects specific records and/or ranges of records
including discontinuous ranges. It is useful for troubleshooting and sampling.
Sort: The Sort tool arranges the records in a table in alphanumeric order, based on the values
of the specified data fields.
Tile: The tile tool assigns a value (tile) based on ranges in the data.
Unique: The Unique Tool distinguishes whether a data record is unique or a duplicate by
grouping on one or more specified fields, then sorting on those fields. The first record in each group
is sent to the Unique output stream while the remaining records are sent to the Duplicate output
stream.
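The split into Unique and Duplicate streams can be sketched in Python as below. This simplified version keeps the records in their input order instead of sorting them, which is an assumption of the example; the function name `unique_split` and the sample data are hypothetical.

```python
def unique_split(records, key_fields):
    """Send the first record of each key group to `unique`, the rest to `duplicates`."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates

# Hypothetical sample data
customers = [
    {"name": "Asha", "city": "Pune"},
    {"name": "Ravi", "city": "Delhi"},
    {"name": "Asha", "city": "Pune"},
]

unique, duplicates = unique_split(customers, ["name", "city"])
print(len(unique), len(duplicates))
```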
3. Join
The Join category includes tools that join two or more streams of data by appending data to wide or
long schemas.
Append Fields: The Append Fields tool appends the fields of one small input (Source)
to every record of another, larger input (Target). The result is a Cartesian join in which every
Target record is paired with every Source record.
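A Cartesian join of this kind can be illustrated with a short Python sketch. The function name `append_fields` and the sample data are hypothetical; the point is only that the output has one row per Target and Source combination.

```python
from itertools import product

def append_fields(target, source):
    """Pair every target record with every source record (Cartesian join)."""
    return [{**t, **s} for t, s in product(target, source)]

# Hypothetical sample data: three target records, one source record
customers = [{"customer": "A"}, {"customer": "B"}, {"customer": "C"}]
rates = [{"tax_rate": 0.05}]

result = append_fields(customers, rates)
print(result)
```

Because the output size is the product of the two input sizes, the Source input is normally kept small.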
Find Replace: The Find and Replace tool searches for data in one field from the input table
and replaces it with a specified field from a different data table.
Fuzzy Match: The Fuzzy Match tool can be used to identify non-identical duplicates in a
database by specifying parameters to match on. Values need not be exact to produce a match; they
only need to fall within the user-specified or predefined parameters set in the configuration
properties.
Join: The Join tool combines two inputs based on a commonality between the two tables. Its
function is like a SQL join, but it gives the option of creating three outputs resulting from the join.
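The three outputs can be sketched in Python, assuming they correspond to unmatched left records, matched records, and unmatched right records. The function name `join`, the single-key matching, and the sample data are assumptions of this illustration.

```python
def join(left, right, key):
    """Return (left_only, joined, right_only) for two lists of records."""
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    joined = [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right
        if lrow[key] == rrow[key]
    ]
    left_only = [row for row in left if row[key] not in right_keys]
    right_only = [row for row in right if row[key] not in left_keys]
    return left_only, joined, right_only

# Hypothetical sample data
people = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
cities = [{"id": 2, "city": "Pune"}, {"id": 3, "city": "Delhi"}]

left_only, joined, right_only = join(people, cities, "id")
print(left_only)
print(joined)
print(right_only)
```

The middle output matches a SQL inner join; the other two hold the records that an inner join would silently drop.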
Join Multiple: The Join Multiple tool combines two or more inputs based on a commonality
between the input tables. Only the joined records are outputted through the tool, resulting in a wide
(columned) file.
Make Group: The Make Group tool takes data relationships and assembles the data into
groups based on those relationships.
Union: The Union tool appends multiple data streams into one unified stream. The tool accepts
multiple inputs based on either field name or record position, creating a stacked output table.
The user then has complete control over how these fields stack or match up.
4. Parse
The Parse tools separate data values into a standard table schema.
DateTime: The Date Time tool standardizes and formats date/time data so that it can be used
in expressions and functions from the Formula or Filter tools.
RegEx: The Regular Expression tool is a robust data parser. There are four types of output
methods that determine the type of parsing the tool will do. These methods are explained in the
Configuration Properties.
Text to Columns: The Text to Columns tool takes the text in one column and splits the string
value into separate, multiple fields based on one or more delimiters.
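Splitting one text value into a fixed number of columns can be sketched in Python as below. The function name `text_to_columns`, the choice to leave any extra text in the last column, and the use of `None` for missing columns are assumptions of this example.

```python
import re

def text_to_columns(value, delimiters, num_columns):
    """Split `value` on any of the delimiter characters into `num_columns` fields."""
    if num_columns <= 1:
        return [value]
    pattern = "[" + re.escape(delimiters) + "]"  # character class of delimiters
    parts = re.split(pattern, value, maxsplit=num_columns - 1)
    parts += [None] * (num_columns - len(parts))  # pad when there are too few parts
    return parts

print(text_to_columns("red,green;blue", ",;", 3))
print(text_to_columns("a,b,c,d", ",", 2))
print(text_to_columns("a", ",", 3))
```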
XML Parse: The XML Parse tool will read in a chunk of Extensible Markup Language and
parse it into individual fields.
5. Transform
Arrange: The Arrange tool allows you to manually transpose and rearrange your data fields
for presentation purposes. Data is transformed so that each record is turned into multiple records
and columns can be created by using field description data or manually created.
Count Records: This Macro returns a count of how many records are going through the
tool.
Cross Tab: The Cross Tab tool pivots the orientation of the data table. It transforms the data so
vertical data fields can be viewed on a horizontal axis, summarizing data where specified.
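The pivot from vertical to horizontal can be sketched in Python as follows. The function name `cross_tab`, the use of a sum as the summarizing method, and the sample data are assumptions made for this illustration.

```python
from collections import defaultdict

def cross_tab(records, group_field, header_field, value_field):
    """Pivot long records into one row per group, with one column per header value."""
    table = defaultdict(lambda: defaultdict(int))
    for record in records:
        # Values that share a group and a header are summed
        table[record[group_field]][record[header_field]] += record[value_field]
    return {group: dict(columns) for group, columns in table.items()}

# Hypothetical sample data in "vertical" (long) form
sales = [
    {"region": "East", "quarter": "Q1", "amount": 10},
    {"region": "East", "quarter": "Q2", "amount": 20},
    {"region": "West", "quarter": "Q1", "amount": 5},
    {"region": "East", "quarter": "Q1", "amount": 7},
]

pivoted = cross_tab(sales, "region", "quarter", "amount")
print(pivoted)
```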
Running Total: The Running Total tool calculates a cumulative sum, per record, in a file.
Summarize: The Summarize tool can conduct a host of Summary Processes, including:
grouping, summing, count, spatial object processing, string concatenation, and much more.
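A few of the summary processes named above (grouping, summing, counting, and string concatenation) can be sketched in Python as below. The function name `summarize`, the output column names, and the sample data are hypothetical.

```python
from collections import defaultdict

def summarize(records, group_field, value_field, label_field):
    """Group records, then compute a sum, a count, and a concatenated string per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_field]].append(record)
    return [
        {
            group_field: group,
            "Sum": sum(r[value_field] for r in rows),
            "Count": len(rows),
            "Concat": ",".join(str(r[label_field]) for r in rows),
        }
        for group, rows in groups.items()
    ]

# Hypothetical sample data
orders = [
    {"region": "East", "item": "pen", "amount": 10},
    {"region": "West", "item": "ink", "amount": 5},
    {"region": "East", "item": "pad", "amount": 20},
]

summary = summarize(orders, "region", "amount", "item")
print(summary)
```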
Transpose: The Transpose tool pivots the orientation of the data table. It transforms the data
so you may view Horizontal data fields on a vertical axis.
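The reverse pivot, from horizontal to vertical, can be sketched in Python as follows. The function name `transpose` and the output column names `Name` and `Value` are assumptions of this example.

```python
def transpose(records, key_fields):
    """Turn each non-key column of a record into its own (Name, Value) row."""
    rows = []
    for record in records:
        keys = {field: record[field] for field in key_fields}
        for name, value in record.items():
            if name not in key_fields:
                rows.append({**keys, "Name": name, "Value": value})
    return rows

# Hypothetical sample data in "horizontal" (wide) form
wide = [{"region": "East", "Q1": 17, "Q2": 20}]

long_rows = transpose(wide, ["region"])
print(long_rows)
```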
Weighted Average: This Macro will calculate the weighted average of an incoming data
field. A weighted average is similar to a common average, but instead of all records contributing
equally to the average, the concept of weight means some records contribute more than others.
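The weighted average described above is the sum of each value multiplied by its weight, divided by the sum of the weights. A minimal Python sketch is given below; the function name `weighted_average` and the sample scores and weights are hypothetical.

```python
def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

scores = [80, 90]
weights = [1, 3]  # the second record counts three times as much as the first

plain = sum(scores) / len(scores)
weighted = weighted_average(scores, weights)
print(plain, weighted)
```

Here the common average is 85.0, while the weighted average is 87.5 because the higher score carries more weight.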
6. In-Database
The In-Database tool category consists of tools that function like many of the Favorites. This
category includes tools for connecting to a database and blending and viewing data, as well as tools
for bringing other data into an In-Database workflow and writing data directly to a database.
Browse In-DB: Review your data at any point in an In-DB workflow. Note: Each Browse In-DB
triggers a database query and can impact performance.
Data Stream In: Bring data from a standard workflow into an In-DB workflow.
Data Stream Out: Stream data from an In-DB workflow to a standard workflow, with an
option to sort the records.
Dynamic Input In-DB: Take In-DB Connection Name and Query fields from a standard data
stream and input them into an In-DB data stream.
Dynamic Output In-DB: Output information about the In-DB workflow to a standard
workflow for Predictive In-DB.
Filter In-DB: Filter In-DB records with a Basic filter or with a Custom expression using the
database’s native language (e.g., SQL).
Formula In-DB: Create or update fields in an In-DB data stream with an expression using
the database’s native language (e.g., SQL).
Join In-DB: Combine two In-DB data streams based on common fields by performing an
inner or outer join.
Union In-DB: Combine two or more In-DB data streams with similar structures based on
field names or positions. In the output, each column will contain the data from each input.
7. Reporting
The Reporting category includes tools that aid in data presentation and organization.
Charting: The Charting tool allows the user to display data in various chart types.
Email: The Email tool lets you select fields from the input data to use when sending e-mail to
recipients, instead of having to use a batch e-mail. It automatically detects the SMTP address and
allows attachments or even e-mailed reports.
Image: The Image Tool allows the user to add graphics to reports.
Layout: The Layout Tool enables the user to arrange Reporting Snippets.
Map Legend Builder: This macro takes the components output from the Legend Splitter
macro and builds them back into a legend table. If you add a Legend Builder tool immediately after
a Legend Splitter tool, the resulting legend will be the same as the legend originally output from the
Map tool. The purpose of the two macros is to let you change the data between them and thereby
create a custom legend.
Overlay: This tool arranges reporting snippets on top of one another for output via the Render
tool.
Render: The Render tool transforms report Snippets into presentation-quality reports in PDF,
HTML, XLSX, DOCX, RTF and Portfolio Composer (*.pcxml) formats.
Report Footer: This macro allows a user to easily set up and add a footer to their
report.
Report Header: This macro allows a user to easily set up and add a header to their
report.
Report Map: The Map Tool enables the user to create a map image from the Alteryx GUI.
The tool accepts multiple spatial inputs, allows for layering these inputs, and supports thematic map
creation. Other cartographic features can be included such as a legend, scale and reference layers.
Report Text: The Text tool allows the user to add text to reports and documents.
Table: The Table tool allows the user to create basic data tables and pivot tables from their
input data.
8. Documentation
Comment: The Text Comment tool adds annotation to the project workspace. This is useful
for jotting down notes or explaining processes to share or reference later.
Explorer Box: The Explorer Box is populated with a web page or file location of the user's
specification.
Tool Container: The Tool Container allows the user to organize tools in a workflow. Tools
can be placed inside the container to isolate a process. The container can then be collapsed, expanded
or disabled.
CHAPTER 4
ALTERYX TOOL IMPLEMENTATION
A workflow consists of connected tools that perform different functions to process data. When you
build a workflow, you add and connect tools. You also configure those tools and workflow
properties. To build a new workflow select File > New Workflow.
Drag a tool from the Tool Palette onto the workflow canvas to begin building a workflow. To connect
tools, select an output anchor and drag the connector arrow to the next tool's input anchor. The
Favorites category includes the most common tools used in workflow creation. To add a tool to the
Favorites category select the star in the top right of the tool icon on the Tool Palette. A yellow star
indicates a tool is already added to the Favorites category.
Workflow Configuration
To add a tool to a workflow, select any tool from the tool palette and drag it onto the workflow
canvas, or right-click the workflow to access a menu to insert tools.
To remove a tool from a workflow, select the tool, and use the Delete key on your keyboard.
Connect Tools
To connect tools in a workflow, drag a tool from the tool palette onto the canvas near the output
anchor of another tool. You can also drag the output anchor from an existing tool to the tool you
just added.
Connections go in through the left side (or top) of a tool and out through the right side (or bottom)
of a tool. Some tools accept multiple inputs indicated by multiple input anchors. Some tools have
optional inputs indicated by a gray input anchor. All tools with an output anchor can be output to
multiple streams.
Select a tool to display the incoming and outgoing connector indicators. The connector input to a
tool displays in green. The connector output from a tool displays in blue.
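Conceptually, a workflow of connected tools behaves like a chain of functions in which the output of one step becomes the input of the next, and one output can feed several streams. The Python sketch below illustrates only this idea; the function names and sample data are hypothetical and do not represent how Designer executes a workflow.

```python
def run_workflow(data, tools):
    """Pass data through each tool in order, like following the connections."""
    for tool in tools:
        data = tool(data)
    return data

def select_amount(rows):
    """Keep only the 'amount' field of each row."""
    return [row["amount"] for row in rows]

def keep_large(amounts):
    """Keep amounts greater than 100."""
    return [a for a in amounts if a > 100]

# Hypothetical sample data
rows = [{"amount": 250}, {"amount": 80}, {"amount": 120}]

# A single stream: select_amount -> keep_large
large = run_workflow(rows, [select_amount, keep_large])
print(large)

# One output feeding two separate streams
amounts = select_amount(rows)
total = sum(amounts)
largest = max(amounts)
print(total, largest)
```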
Workflow Tab Right-Click Options
Select a workflow tab on the canvas to open it, then right-click the tab to display a menu with these
options:
Designer can change path dependencies from relative to absolute.
Select a tool on the canvas, then right-click to display a menu with these options:
● Cache and Run Workflow: Runs the workflow and caches all data up to the selected tool.
You can use multiple caches in a single workflow. When you interact with a tool that has
been cached, you can view the tool's configuration without clearing the cache.
However, changing the tool's configuration will release the cached data. Alteryx clears
caches when you close the workflow. To cache a tool...
● Align Horizontally: The selected tools align horizontally with the selected tool that was
placed on the canvas first.
● Align Vertically: The selected tools align vertically with the selected tool that was placed on
the canvas first.
● Distribute Horizontally: The selected tools are arranged to have even space between them
along the horizontal axis.
● Distribute Vertically: The selected tools are arranged to have even space between them along
the vertical axis.
Annotate a Workflow
Add annotations to individual tools in the tool Configuration window > Annotation, or add a
Comment tool to the workflow.
Canvas Options
● Run Workflow: Run the workflow. You can also use Ctrl+R to run the workflow. The icon
changes to Stop Workflow while the workflow is running.
● Stop Workflow: Stop the workflow. You can also use Ctrl+R to stop the workflow.
● Run As Analytic App: Run the analytic application.
● Add Workflow to Schedule: Schedule workflows to run at specific times and frequencies.
● Active Documents: Show open workflows, apps, or macros.
● New Blank Workflow: Create a new workflow (.yxmd).
● Zoom In: Increase the normal zoom by 3/2.
● Zoom Out: Decrease the normal zoom by 2/3.
Save Files
Designer prompts you to save unsaved workflows when you have one or more unsaved workflows
and you attempt to exit the application. Designer displays "Save changes to the following
workflows?" and provides these options:
● Save Selected: Select the workflows you want to save and then select Save Selected.
Designer saves the selected workflows and closes. Any workflows you do not select are not
saved.
● Discard Changes: Select Discard Changes to discard all changes and exit without saving.
● Cancel: Select Cancel to return to the canvas.
CHAPTER 5
RESULT AND ANALYSIS
Workflow of data analysis
Final output of data analysis
CHAPTER 6
CONCLUSION AND FUTURE WORK
Alteryx Designer is a Windows software application that provides an intuitive, easy-to-use
drag-and-drop user interface for creating repeatable workflow processes for blending and analyzing
data and performing advanced analytics (such as predictive, spatial, and prescriptive). You
can drag tools from a tool palette onto a canvas and connect them to each other in a process
flow that results in one of three outputs: a workflow, an analytic app, or a macro. You can also use
the processes you create to quickly and automatically produce results that can be easily shared with
others.
Alteryx Designer is an extremely powerful, scalable, and dynamic software application that has
been and continues to be in a state of growth, transition, and transformation as a stand-alone product.
To classify Alteryx Designer as a stand-alone product, however, would not be entirely correct.
Alteryx Designer, together with other products such as Alteryx Server, Alteryx Connect, and Alteryx
Promote, forms what is referred to as the Alteryx Analytic Process Automation Platform, or Alteryx
APA Platform for short.
Flexibility
Designer’s extensive library of tools makes once-difficult tasks easier and offers a very wide
range of possibilities for connecting the tools together to achieve the desired results.
The same flexibility applies to the large number of data sources that can be connected to and
updated. If you just want to clean up some poorly formatted data, you can easily do that; if you also
want to connect to a website, download a table of data and process it further, you can do that too.
Breadth of Solutions
Because of this flexibility, a wide variety of workflows can be built in Designer to
address a broad range of use cases. With Alteryx Designer you can handle problems
in many different business areas and analytics disciplines such as Customer Success, HR, Finance,
Operations, Sales & Support, Marketing, and IT.
Support Network
Alteryx is known to have a somewhat cult-like following among its users, and I count myself
in that group. Many of the tens of thousands of Alteryx Community users love supporting each other
on the Community, as we affectionately call it, and the Alteryx Community has won awards as well.
Beyond the users who make it such an inviting place, Alteryx itself has done an excellent job of
providing resources and training material for users who are getting started or learning about a new
category of tools and capabilities in Alteryx Designer.
Appendix A
INDUSTRIAL INTERNSHIP EVALUATION FORM
For the Students of B.Tech. (ECE), Sasi Institute of Technology
& Engineering, Tadepalligudem, West Godavari District, Andhra
Pradesh
Date:
Evaluate this student intern on the following parameters by checking the appropriate attributes.
Attributes
Attendance (Punctuality)
Productivity (Volume, Promptness)
Quality of Work (Accuracy, Completeness, Neatness)
Initiative (Self-Starter, Resourceful)
Attitude (Enthusiasm, Desire to Learn)
Interpersonal Relations (Cooperative, Courteous, Friendly)
Ability to Learn (Comprehension of New Concepts)
Use of Academic Training (Applies Education to Practical Usage)
Communications Skills
Judgement (Decision Making)
Areas where students gained new skills, insights, values, confidence, etc.
Points Awarded
Overall Evaluation of the Intern’s Performance (Evaluation Scale shown below)
Evaluation Scale:
Attributes    Excellent    Very Good    Good    Satisfactory    Poor
Points        5            4            3       2               1
Designation :
Signature of Incharge (Guide/Supervisor)
Appendix B
PO's And PSO's Relevance
With Internship Work
PO4. Conduct investigations of complex problems: Research-based knowledge and research methods
including design of experiments, analysis and interpretation of data and synthesis
with an understanding of the limitations.
Relevance: Investigation of various problems of farmers.
PO8. Ethics: Apply ethical principles and commit to professional ethics and
responsibilities and norms of the engineering practice.
Relevance: Able to identify standard norms.
PO12. Life-long learning: Recognize the need for and have the preparation and
ability to engage in independent and life-long learning in the broadest context of technological
change.
Relevance: It is an endless learning procedure because an entrepreneur should learn every day
from everything.
PSO1. Application Development
Relevance: An application that helps farmers.