Mid Prep Data
Expression: a combination of values and operations that creates a new value that
we call a return value - i.e. the value returned by the operation(s).
Statement: doesn't return a value, but does perform some task. Some statements
may control the flow of the program, and others might ask for resources.
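The distinction above can be sketched in a few lines. The variable names are hypothetical and chosen only for illustration.

```python
# The right-hand sides below are expressions: each one evaluates to a value.
total = 3 * (4 + 5)     # the expression 3 * (4 + 5) returns 27
is_big = total > 20     # the expression total > 20 returns True

# The if/else below is a statement: it returns no value,
# but it controls which branch of the program runs.
if is_big:
    label = 'big'
else:
    label = 'small'

print(total, is_big, label)
```

Each assignment line is itself a statement; only its right-hand side is an expression.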
Indentation
Programmers in most languages indent code to make it more readable.
Indentation signals that the indented statements are grouped together,
meaning those statements have some common purpose. Python treats
indentation differently: it is not just a convention but a requirement,
because indentation is how Python defines a group (block) of statements.
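As a small sketch of indentation defining groups, the hypothetical function below has three nesting levels, each marked only by indentation.

```python
def keep_evens(numbers):
    # Everything indented under "def" belongs to the function body.
    evens = []
    for n in numbers:
        # Indented under "for": runs once per element.
        if n % 2 == 0:
            # Indented under "if": runs only for even numbers.
            evens.append(n)
    # Back at function level: runs once, after the loop finishes.
    return evens

print(keep_evens([1, 2, 3, 4]))
```

Moving the return line one level deeper would place it inside the loop and change what the function does, which is why Python treats indentation as part of the syntax.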
Naming Objects
Python types:
Int - integer (whole) numbers
Float - floating point values or real numbers
Boolean - True or False
String - 'any text'
List - [4, 3.26, 'Baku']
Dictionary - {'Namiq' : '15-March-1989', 'Rasim' : '14-June-2008'}
Tuple - (1, 2)
Set - {'ag', 'qara', 'qirmizi'}
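The built-in type() function reports which of these types a value has. A short sketch using the sample values from the list above:

```python
samples = [
    42,                              # int
    3.26,                            # float
    True,                            # bool
    'any text',                      # str
    [4, 3.26, 'Baku'],               # list
    {'Namiq': '15-March-1989'},      # dict
    (1, 2),                          # tuple
    {'ag', 'qara', 'qirmizi'},       # set
]

# Collect the name of each value's type.
type_names = [type(value).__name__ for value in samples]
print(type_names)
```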
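For integer operands, division with / always returns a float, while +, - and * return an int.
// is floor (integer) division; with two integer operands it returns an int, not a float.
% returns the remainder of the division; with two integer operands the result is an int.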
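These rules can be checked directly. The float case on the last line is added here for contrast and is not from the notes above.

```python
true_div = 7 / 2        # 3.5, a float
exact_div = 6 / 3       # 2.0, still a float even though it divides evenly
floor_div = 7 // 2      # 3, an int
remainder = 7 % 2       # 1, an int
product = 7 * 2         # 14, an int

# If either operand is a float, // returns a float.
float_floor = 7.0 // 2  # 3.0

print(type(true_div), type(floor_div), type(remainder), type(float_floor))
```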
Strings:
String: any sequence of printable characters.
concatenate +: The operator + requires two string objects and creates a new string
object. The new string object is formed by concatenating copies of the two string
objects together: the first string joined at its end to the beginning of the second
string.
repeat *: The * takes a string object and an integer and creates a new string
object. The new string object has as many copies of the string as is indicated by
the integer.
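A minimal sketch of both operators, using made-up strings:

```python
first = 'data'
second = 'base'

joined = first + second   # concatenation creates the new string 'database'
repeated = 'ab' * 3       # repetition creates the new string 'ababab'

# The original strings are unchanged; + and * always build new objects.
print(joined, repeated, first, second)
```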
Methods
A method is a variation on a function. It looks very similar. It has a name and it has
a list of arguments in parentheses. It differs, however, in the way it is invoked.
Every method is called in conjunction with a particular object. The kinds of
methods that can be used with an object depend on the object's
type. String objects have a set of methods suited for strings, just as integers have
integer methods and floats have float methods. A method is invoked using
dot notation: the object, a dot, then the method name and its arguments,
as in object.method(arguments).
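A few built-in methods illustrate dot notation on different types. The values are arbitrary examples.

```python
city = 'baku'

upper_city = city.upper()        # string method: returns 'BAKU'
parts = 'a,b,c'.split(',')       # string method: returns ['a', 'b', 'c']

bits = (255).bit_length()        # integer method: returns 8
whole = (2.0).is_integer()       # float method: returns True

print(upper_city, parts, bits, whole)
```

The parentheses around 255 and 2.0 are only there so Python does not read the dot as a decimal point.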
Lists:
A list can contain elements other than characters. In fact, a list can contain a
sequence of elements of any type, even different typed elements mixed together
in the same list.
A list is a mutable type. This means that, unlike a string object, a list object can be
changed after it is initially created.
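The difference between a mutable list and an immutable string can be sketched as follows, starting from the sample list used earlier in these notes:

```python
mixed = [4, 3.26, 'Baku']   # elements of different types in one list

mixed[0] = 5                # replace an element in place
mixed.append([1, 2])        # add a new element (here, another list)

# The same kind of in-place change fails for a string.
name = 'Baku'
try:
    name[0] = 'b'
    string_changed = True
except TypeError:
    string_changed = False  # strings do not support item assignment

print(mixed, string_changed)
```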
Tuples: immutable lists, so no function or method can change a tuple after it is
created. A tuple is written inside parentheses (), with elements separated by commas.
Because they cannot change, tuples can be more efficient than lists in time and
space, which is one reason they are used.
The operations familiar from other sequences (lists and strings) are available,
except, of course, those operators that violate immutability.
Operators such as + (concatenate) and * (repeat) work as before.
Slicing also works as before.
Membership (in) and iteration (for) also work on tuples.
len, min, max, greater than (>), less than (<), sum and others work the same
way. In particular, any comparison operation has the same restrictions for
mixed types.
None of the operations that change lists are available for tuples. For
example, append, extend, insert, remove, pop, reverse, and sort do not work on
tuples. Here is a session demonstrating the various operators working with tuples.
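The original session is not reproduced in these notes. A minimal sketch of what such a session might show, with made-up values:

```python
point = (1, 2)

combined = point + (3, 4)     # concatenation: (1, 2, 3, 4)
doubled = point * 2           # repetition: (1, 2, 1, 2)
middle = combined[1:3]        # slicing: (2, 3)
has_three = 3 in combined     # membership: True

length = len(combined)        # 4
smallest = min(combined)      # 1
largest = max(combined)       # 4
total = sum(combined)         # 10

# List-changing methods such as append do not exist on tuples.
try:
    combined.append(5)
    append_worked = True
except AttributeError:
    append_worked = False

print(combined, doubled, middle, has_three, append_worked)
```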
The flexibility of Python lists comes at a cost: to allow mixed types, each item in the
list must carry its own type info, reference count, and other information; that is,
each item is a complete Python object. In the special case that all items are of
the same type, much of this information is redundant: it can be much more
efficient to store the data in a fixed-type array, such as a NumPy array.
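A small sketch of the contrast, assuming NumPy is installed. Unlike a list, a NumPy array stores one data type for the whole array, so mixed numeric input is converted to a common type.

```python
import numpy as np

# A list keeps each element as its own Python object with its own type.
mixed_list = [1, 2, 3.5]
list_types = [type(item).__name__ for item in mixed_list]

# An array has a single dtype; the integers are upcast to float here.
fixed_array = np.array(mixed_list)

print(list_types)           # per-element types
print(fixed_array.dtype)    # one type for the whole array
```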
NumPy's universal functions (ufuncs) exist in two flavors: unary ufuncs, which
operate on a single input, and binary ufuncs, which operate on two inputs. We'll
see examples of both types of functions here.
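A minimal sketch of each flavor, assuming NumPy is installed. The particular ufuncs shown are arbitrary picks from the many NumPy provides.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])

# Unary ufuncs take one input array.
absolute = np.abs(x)          # element-wise absolute value
negated = np.negative(x)      # element-wise negation

# Binary ufuncs take two inputs (arrays or scalars).
shifted = np.add(x, 10)       # same as x + 10
squared = np.power(x, 2)      # same as x ** 2

# The nin attribute of a ufunc gives its number of inputs.
print(np.abs.nin, np.add.nin)
print(absolute, negated, shifted, squared)
```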
NumPy
Each array has the attributes ndim (the number of dimensions), shape (the size of each dimension),
and size (the total number of elements in the array).
Another useful attribute is dtype, the data type of the array's elements.
Other attributes include itemsize, which gives the size (in bytes) of each array element, and nbytes,
which gives the total size (in bytes) of the array.
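These attributes can be inspected on a small array. The sketch assumes NumPy is installed; the 3 x 4 shape and the int64 type are arbitrary choices.

```python
import numpy as np

grid = np.zeros((3, 4), dtype=np.int64)

print(grid.ndim)      # 2 dimensions
print(grid.shape)     # (3, 4)
print(grid.size)      # 12 elements in total
print(grid.dtype)     # int64
print(grid.itemsize)  # 8 bytes per element
print(grid.nbytes)    # 96 bytes in total (size * itemsize)
```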
One important and extremely useful thing to know about array slices is that they
return views rather than copies of the array data, so modifying a slice also
modifies the original array.
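The view behavior can be sketched as follows, assuming NumPy is installed. An explicit .copy() is the usual way to get independent data.

```python
import numpy as np

original = np.arange(10)      # array of 0..9

view = original[2:5]          # a slice is a view onto the same data
view[0] = 99                  # this also changes original[2]

independent = original[2:5].copy()   # an explicit copy owns its data
independent[1] = -1                  # original[3] is not affected

print(original)
print(np.shares_memory(view, original), np.shares_memory(independent, original))
```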
Pandas:
A Pandas Series is a one-dimensional array of indexed data. It can be created from a list or array
as follows:
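The example itself is missing from these notes. A minimal sketch, assuming pandas is installed and using made-up values:

```python
import pandas as pd

# From a list: pandas supplies a default integer index 0, 1, 2, ...
data = pd.Series([0.25, 0.5, 0.75, 1.0])

# An explicit index can be given instead.
labeled = pd.Series([0.25, 0.5, 0.75], index=['a', 'b', 'c'])

print(data)
print(labeled['b'])
```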
11. Introduction to Combining Datasets: Data analysis often involves combining different
data sources, from simple concatenation to more complex joins and merges. Pandas
provides functions and methods to facilitate this process.
12. Simple Concatenation with pd.concat(): The pd.concat() function is similar to NumPy's
np.concatenate() but designed for Series and DataFrames. It allows concatenation along
a specified axis and fills entries that are missing after alignment with NaN.
13. Concatenating Series and DataFrames: pd.concat() takes a list of Series or
DataFrames. Data can be concatenated row-wise (axis=0, the default) or column-wise
(axis=1) by specifying the axis parameter.
14. Handling Duplicate Indices: pd.concat() preserves indices by default, even if they are
duplicates. verify_integrity=True raises an error when the result would contain duplicate
indices, and ignore_index=True discards the original indices and builds a new integer index.
15. Adding MultiIndex Keys: The keys parameter in pd.concat() adds hierarchical indexing to
the resulting DataFrame, useful for identifying the source of data.
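16. Concatenation with Joins: pd.concat() offers options for handling columns with different
names: join='outer' (the default) keeps the union of columns, join='inner' keeps the
intersection. Older pandas versions also had a join_axes parameter to specify the resulting
columns; it was removed in pandas 1.0, so reindex the result instead.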
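17. The append() Method: In older pandas versions, Series and DataFrames have an append()
method that adds the rows of another object. It's a simpler alternative to pd.concat() but
creates a new object each time. append() was deprecated in pandas 1.4 and removed in
pandas 2.0; use pd.concat() in current code.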
18. Efficiency Considerations: Repeated append() calls are less efficient than a single
pd.concat(), because each call creates a new index and data buffer. When combining many
objects, collect them in a list and pass the list to pd.concat() once.
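The pd.concat() options described in items 12 to 16 can be sketched on two small DataFrames. The sketch assumes pandas is installed; the frames, column names, and keys are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A1', 'A2'], 'B': ['B1', 'B2']}, index=[1, 2])
df2 = pd.DataFrame({'B': ['B3', 'B4'], 'C': ['C3', 'C4']}, index=[1, 2])

# Row-wise (default): indices are preserved, so 1 and 2 appear twice,
# and columns missing from one frame are filled with NaN.
rows = pd.concat([df1, df2])

# Column-wise: frames are placed side by side, aligned on the index.
side_by_side = pd.concat([df1, df2], axis=1)

# ignore_index=True builds a new integer index 0..3.
renumbered = pd.concat([df1, df2], ignore_index=True)

# verify_integrity=True raises ValueError on duplicate indices.
try:
    pd.concat([df1, df2], verify_integrity=True)
    duplicates_found = False
except ValueError:
    duplicates_found = True

# keys adds an outer index level that records the source of each row.
keyed = pd.concat([df1, df2], keys=['first', 'second'])

# join='inner' keeps only the columns shared by both frames.
shared = pd.concat([df1, df2], join='inner')

print(rows, side_by_side, renumbered, keyed, shared, sep='\n\n')
```

Passing all the frames to one pd.concat() call, as here, is also the pattern item 18 recommends over repeated appends.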