IMQAV

This document outlines the IMQAV framework for data science projects which includes ingesting, modeling, querying, analyzing, and visualizing data. It describes techniques for each step, such as using Kafka for ingestion, relational and non-relational databases for modeling, MapReduce for querying, NumPy and scikit-learn for analysis, and matplotlib for visualization.

Uploaded by

Electron Volt

11/4/22, 12:10 PM UMQAV - Jupyter Notebook

NumPy, Data Science, and IMQAV


Ingest
Model
Query
Analyze
Visualize

Application of IMQAV
Organization
Architecture
Set of Tasks

Ingest
Ingestion is a set of software engineering techniques for adapting to high volumes of data that arrive rapidly (often via
streaming).

Kafka
RabbitMQ
Fluentd
Sqoop
Kinesis (AWS)
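
As a rough illustration of the idea, not tied to any of the tools listed above, the sketch below groups a fast-arriving stream into fixed-size micro-batches so that downstream storage can be written in bulk. The names `micro_batches` and `fake_stream` are hypothetical; a real pipeline would read from a broker such as Kafka rather than from a Python generator.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(stream: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` records until the stream is exhausted."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fake_stream(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for a message source (assumption, not a real consumer)."""
    for i in range(n):
        yield {"event_id": i, "value": i * 0.5}


batches = list(micro_batches(fake_stream(7), size=3))
for batch in batches:
    print(len(batch), [rec["event_id"] for rec in batch])
```

Batching trades a little latency for fewer, larger writes, which is one common way to keep up with rapid arrival rates.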

Model
Modeling is a set of data architecture techniques to create data storage that is appropriate for a particular
domain.

Relational
    MySQL
    Postgres
    RDS (AWS)
Key Value
    Redis
    Riak
    DynamoDB (AWS)
Columnar
    Cassandra
    HBase
    Redshift (AWS)
Document
    MongoDB
    Elasticsearch
    Couchbase
Graph
    Neo4j
    OrientDB
    ArangoDB
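
To make the difference between storage models more concrete, here is a minimal sketch that stores one made-up record three ways using only the Python standard library. The in-memory SQLite database stands in for a relational store and a plain dictionary stands in for a key-value store; the table names, key format, and `user` record are assumptions for illustration only.

```python
import json
import sqlite3

# Hypothetical record used for all three models.
user = {"id": 1, "name": "Ada", "languages": ["Python", "R"]}

# Relational: fixed columns, one row per entity; the list goes in a child table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE user_languages (user_id INTEGER, language TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", (user["id"], user["name"]))
conn.executemany(
    "INSERT INTO user_languages VALUES (?, ?)",
    [(user["id"], lang) for lang in user["languages"]],
)
rows = conn.execute(
    "SELECT u.name, l.language FROM users u "
    "JOIN user_languages l ON l.user_id = u.id ORDER BY l.language"
).fetchall()
conn.close()

# Key-value: an opaque value looked up by a single key.
kv_store = {f"user:{user['id']}": json.dumps(user)}

# Document: the nested structure is stored and read back as one unit.
document = json.loads(kv_store["user:1"])

print(rows)
print(document["languages"])
```

The relational version needs a join to reassemble the record, while the document version keeps the nested list together; which trade-off is appropriate depends on the domain.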

Query
Query refers to extracting data (from storage) and modifying that data to accommodate anomalies such as
missing data.

Batch
    MapReduce
    Spark
    Elastic MapReduce (AWS)
Batch SQL
    Hive
    Presto
    Drill
Streaming
    Storm
    Spark Streaming
    Samza
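
The two halves of the definition above, extracting data and then adjusting it for missing values, can be sketched on a small scale as follows. This uses an in-memory SQLite table rather than any of the engines listed; the `readings` table, its contents, and the choice of mean imputation are all assumptions made for illustration.

```python
import sqlite3
from statistics import mean

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 20.0), ("b", None), ("c", 22.0), ("d", None), ("e", 24.0)],
)

# Extract: pull the raw rows out of storage.
raw = conn.execute(
    "SELECT sensor, temperature FROM readings ORDER BY sensor"
).fetchall()
conn.close()

# Modify: replace missing temperatures (SQL NULL arrives as None)
# with the mean of the observed values.
observed = [t for _, t in raw if t is not None]
fill_value = mean(observed)
cleaned = [(s, t if t is not None else fill_value) for s, t in raw]

print(cleaned)
```

Mean imputation is only one option; dropping incomplete rows or carrying the last observation forward may suit other datasets better.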

Analyze
Analyze is a broad category that includes techniques from computer science, mathematical modeling, artificial
intelligence, statistics, and other disciplines.

NumPy is included within 'Analyze'


Statistics
    SPSS
    SAS
    R
    Statsmodels
    SciPy
    Pandas
Optimization and Mathematical Modeling (SciPy and other libraries)
    Linear, Integer, and Dynamic Programming
    Gradient and Lagrange methods
Machine Learning
    Batch
        H2O
        Mahout
        SparkML
    Interactive
        scikit-learn
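
Since NumPy sits in this stage, a small sketch of the kind of work it does may help: descriptive statistics and a least-squares line fit on an array. The data values are invented for illustration, and this is only one of many analyses the stage could involve.

```python
import numpy as np

# Hypothetical measurements: y is exactly 2*x + 1 here.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Descriptive statistics computed over the whole array at once.
y_mean = y.mean()
y_std = y.std()  # population standard deviation (ddof=0)

# Least-squares line fit: solve [x, 1] @ [slope, intercept] ~= y.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print(y_mean, y_std)
print(slope, intercept)
```

The same fit could be obtained with higher-level libraries such as Statsmodels or scikit-learn; the NumPy version simply exposes the underlying linear algebra.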

Visualize
Visualize refers to transforming data into visually attractive and informative formats.

matplotlib
seaborn
bokeh
pandas
D3
Tableau
Leaflet
Highcharts
Kibana

localhost:8888/notebooks/Desktop/Ex_Files_NumPy_Data_EssT/Exercise Files/Ch 0/00_03/Finish/UMQAV.ipynb 3/3