Pipelines, Functions, OOPs
2. Types of Functions
4. Generators
6. Python Pipelines
Recursive Functions
    ...
    else:
        return n * factorial(n - 1)

Generator Functions
    ...
    count += 1

Higher-Order Functions
    def square(x):
        return x * x

    nums = [1, 2, 3, 4]
    sq_nos = map(square, nums)

Asynchronous Functions
    import asyncio

    async def say_hello():
        await asyncio.sleep(1)
        print("Hello!")

    asyncio.run(say_hello())

Decorators
    def my_decorator(func):
        def wrapper():
            func()
        return wrapper

    @my_decorator
    def say_hello():
        print("Hello!")

    say_hello()

Static and Class Methods
    class MyClass:
        @staticmethod
        def static_method():
            print("This is a static method.")

        @classmethod
        def class_method(cls):
            print("This is a class method.")

    MyClass.static_method()
    MyClass.class_method()
Types of Functions
1. Built-in Functions : Pre-defined in Python and always available to use.
2. User-Defined Functions : The user defines and creates them using the def keyword.
3. Anonymous (Lambda) Functions : Small, one-line functions created with the lambda keyword.
They don’t require a def keyword or a name.
4. Recursive Functions : Functions that call themselves to solve smaller instances of the
same problem.
5. Higher-Order Functions : Functions that take other functions as arguments or return
functions as their result.
6. Generator Functions : Special functions that return a generator object. They use yield
instead of return to produce a series of values lazily, one at a time.
7. Decorators : Functions that modify the behavior of other functions. They take a function as
an argument and return a new function with additional or altered behavior.
8. Static and Class Methods
• Static Methods: Defined with the @staticmethod decorator, they don’t require access to the
instance or class and behave like regular functions but belong to the class's namespace.
• Class Methods: Defined with the @classmethod decorator, they take the class (cls) as the
first parameter and can modify class state.
9. Asynchronous Functions (Async/Await)
These are functions defined using async def, allowing for asynchronous programming. They
can perform non-blocking operations with the use of await.
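Several of the function types listed above can be sketched in a few lines. This is a minimal illustration only; the names cube, factorial, count_up_to, and apply_twice are hypothetical and chosen for this example.

```python
# Anonymous (lambda) function: a one-line function with no def statement.
cube = lambda x: x ** 3

# Recursive function: calls itself on a smaller instance of the problem.
def factorial(n):
    if n <= 1:                      # base case stops the recursion
        return 1
    return n * factorial(n - 1)

# Generator function: uses yield to produce values lazily, one at a time.
def count_up_to(limit):
    count = 1
    while count <= limit:
        yield count
        count += 1

# Higher-order function: takes another function as an argument.
def apply_twice(func, value):
    return func(func(value))

print(cube(3))                  # 27
print(factorial(5))             # 120
print(list(count_up_to(4)))     # [1, 2, 3, 4]
print(apply_twice(cube, 2))     # 512
```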
User-Defined Functions
Enhance maintainability, modularity, reusability, and readability during development.
Generators
Streaming Data: Process streams of data, such as log files or network responses, incrementally.
Lazy Evaluation (Infinite Sequences): Generate Fibonacci numbers or prime numbers on demand.
Memory Efficiency (Large Datasets): Generate values on the fly without storing them all in memory.
Pipelines (Data Pipelines): Each stage yields data to the next stage.
Stateful Iteration: Implement custom iterators that maintain their state between iterations, allowing
complex iteration logic.
Backtracking Algorithms (Search Problems): Solve problems that require backtracking, such as
generating permutations or combinations, where the generator can pause and resume.
Caching and Memoization: Use generators to cache results of expensive computations and yield
them as needed.
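Two of these use cases, infinite sequences and streaming pipelines, might look like the sketch below. The function names and the sample_log data are assumptions made for illustration; a real log would be read from a file or a network response.

```python
from itertools import islice

def fibonacci():
    """Infinite sequence: yields Fibonacci numbers one at a time."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def read_lines(lines):
    """Stage 1: stream lines one by one (stands in for a log file)."""
    for line in lines:
        yield line.strip()

def only_errors(lines):
    """Stage 2: pass along only the lines that mention ERROR."""
    for line in lines:
        if "ERROR" in line:
            yield line

# Hypothetical sample data used only for this example.
sample_log = ["INFO start\n", "ERROR disk full\n", "INFO retry\n", "ERROR timeout\n"]

# islice takes a finite slice of the infinite generator.
first_six = list(islice(fibonacci(), 6))
# Each stage pulls items lazily from the previous stage.
errors = list(only_errors(read_lines(sample_log)))

print(first_six)  # [0, 1, 1, 2, 3, 5]
print(errors)     # ['ERROR disk full', 'ERROR timeout']
```

No stage holds the whole data set in memory; each line flows through both stages before the next one is read.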
Class, Objects & OOPS
Class :
A blueprint for creating objects.
It defines a set of attributes and methods that the created objects (instances) will have.
Object :
An instance of a class.
A self-contained entity that consists of attributes (variables) and methods (functions) defined by its
class.
Inheritance :
A mechanism by which one class (child or subclass) can inherit attributes and methods from
another class (parent or superclass).
This allows for code reuse and the creation of a hierarchical relationship between classes.
Polymorphism:
The ability of objects of different classes to be treated as instances of a common parent class
through inheritance.
It allows a single method to behave differently based on the object that it is acting upon.
Encapsulation:
The practice of bundling the data (attributes) and the methods that operate on that data into a single
unit (a class), and restricting access to some of the object's components. In Python this is done by
convention: a leading underscore (_) marks an attribute as non-public, a double leading underscore (__)
triggers name mangling, and public methods are provided to access or modify the attribute.
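The four ideas above (class and object, inheritance, polymorphism, encapsulation) can be shown together in one small sketch. The Animal, Dog, and Cat classes are hypothetical examples, not part of the original material.

```python
class Animal:
    """Parent class: a blueprint shared by all animals."""

    def __init__(self, name):
        self._name = name          # leading underscore: non-public by convention

    @property
    def name(self):
        """Public, read-only access to the non-public attribute."""
        return self._name

    def speak(self):
        return f"{self._name} makes a sound"

class Dog(Animal):                 # inheritance: Dog reuses Animal's __init__ and name
    def speak(self):               # overriding speak gives polymorphic behavior
        return f"{self._name} barks"

class Cat(Animal):
    def speak(self):
        return f"{self._name} meows"

# Objects of different classes are handled through the same interface.
animals = [Dog("Rex"), Cat("Tom")]
sounds = [animal.speak() for animal in animals]
print(sounds)            # ['Rex barks', 'Tom meows']
print(animals[0].name)   # Rex
```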
Introduction to Pipelines
What is a Pipeline ?
A series of data processing steps that are connected together, where the output of one step
becomes the input for the next.
Why Pipeline ?
Avoiding manual intervention in repetitive tasks to reduce errors and increase productivity.
Real-world example :
Scenario:
Imagine a company that needs to regularly analyze customer data to predict future purchasing
trends. Without a pipeline, this process would involve manually cleaning the data, selecting
features, and running models each time new data is available.
Pipeline Solution:
By creating a data pipeline, the company can automate the entire process: data cleaning, feature
selection, model training, and evaluation are all done automatically whenever new data is added.
This not only saves time but also ensures that the process is consistent and repeatable.
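The scenario above can be sketched in plain Python as a list of steps applied in order. Everything here is an assumption for illustration: the step names, the customers records, and the use of a simple average as a stand-in for model training.

```python
def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if None not in r.values()]

def select_features(records):
    """Feature selection: keep only the fields used downstream."""
    return [{"age": r["age"], "spend": r["spend"]} for r in records]

def average_spend(records):
    """Stand-in for model training: compute the mean spend."""
    return sum(r["spend"] for r in records) / len(records)

def run_pipeline(data, steps):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data

# Hypothetical customer data; record B has a missing age and is dropped.
customers = [
    {"name": "A", "age": 30, "spend": 120.0},
    {"name": "B", "age": None, "spend": 80.0},
    {"name": "C", "age": 45, "spend": 200.0},
]

result = run_pipeline(customers, [clean, select_features, average_spend])
print(result)  # 160.0
```

Whenever new data arrives, calling run_pipeline again repeats exactly the same steps in the same order.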
Advantages of Pipelines
Automation
Reduces manual intervention, making the process more efficient & less error-prone.
Consistency
Ensures that the same transformations are applied to training and test data.
Modularity
Simplifies the process of modifying individual components without affecting the entire pipeline.
Reusability
Pipelines can be reused across different projects or datasets.
Scalability
Facilitates scaling the process for large datasets and more complex models.
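The consistency point can be illustrated with a small sketch: a transformation learns its parameters from the training data once and then applies the same parameters to the test data. The MeanCenterer class is hypothetical and written in plain Python; it is not taken from any library.

```python
class MeanCenterer:
    """Hypothetical transformer that subtracts the training mean."""

    def fit(self, values):
        # Learn the parameter from the training data only.
        self.mean_ = sum(values) / len(values)
        return self

    def transform(self, values):
        # Apply the learned parameter to any data set.
        return [v - self.mean_ for v in values]

train = [1.0, 2.0, 3.0]
test = [4.0, 5.0]

centerer = MeanCenterer().fit(train)
train_centered = centerer.transform(train)
test_centered = centerer.transform(test)

print(train_centered)  # [-1.0, 0.0, 1.0]
print(test_centered)   # [2.0, 3.0]
```

The test data is shifted by the training mean (2.0), not by its own mean, so both sets go through an identical transformation.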
Pipeline @ various stages
I. Data Collection:
The initial step involves gathering data from various sources. This could include databases,
files, APIs, or web scraping.
II. Data Preprocessing:
Usages of Dunder Methods
Arithmetic Operations
Container Methods
Callable Objects
Miscellaneous
• __copy__(self): Defines behavior for copying objects using the copy module.
• __deepcopy__(self, memo): Defines behavior for deep copying objects.
• __new__(cls, ...): Defines behavior for creating a new instance of a class; called before __init__.
• __hash__(self): Defines behavior for hashing an object (used in hash-based collections like sets
and dictionaries).
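The four methods listed above might be combined as in the following sketch. The Point class is a hypothetical example; __eq__ is added as well, since hash-based collections need equal objects to have equal hashes.

```python
import copy

class Point:
    def __new__(cls, x, y):
        # __new__ creates the instance; it runs before __init__.
        return super().__new__(cls)

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points produce equal hashes, so sets and dicts treat them as one key.
        return hash((self.x, self.y))

    def __copy__(self):
        # Called by copy.copy().
        return Point(self.x, self.y)

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(); memo tracks objects already copied.
        return Point(copy.deepcopy(self.x, memo), copy.deepcopy(self.y, memo))

p = Point(1, 2)
q = copy.copy(p)
r = copy.deepcopy(p)

print(p == q, p is q)   # True False
print(len({p, q, r}))   # 1
```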