Se ha denunciado esta presentación.
Se está descargando tu SlideShare. ×
Próximo SlideShare
Introduction to pandas
Introduction to pandas
Cargando en…3

Eche un vistazo a continuación

1 de 119 Anuncio

Más Contenido Relacionado

Presentaciones para usted (20)

Similares a Python Pandas (20)


Más reciente (20)


Python Pandas

  1. 1. Pandas Library Lets play with Tabular Data 6/1/2020 1
  2. 2. What is Pandas ?  Pandas is open source.  BSD- licensed Python library providing high – Performance.  Easy to use for data structures and data analysis.  Pandas use for different types of data. o Tabular data with heterogeneously-typed columns. o Ordered and unordered time series data. o Arbitrary matrix data with row & column labels o Unlabeled data o Any other form of observational or statistical data sets 6/1/2020 2
  3. 3. Pandas is Data Frame 6/1/2020 3 max std filter min count
  4. 4. Features of pandas  It has a DataFrame Object with customized indexing.  We can reshape data sets.  We can perform aggregations and Transformations on the data sets  It is used for data alignment and integration of the missing data.  It provides the functionality of Time Series.  Process a variety of data sets in different formats like matrix data, tabular heterogeneous, time series.  Handle multiple operations of the data sets such as subsetting, slicing, filtering, groupBy, re-ordering, and re-shaping.  It integrates with the other libraries such as SciPy, and scikit-learn.  Provides fast performance, and If you want to speed it, even more, you can use the Cython. 6/1/2020 4
  5. 5. Usage of Pandas  Modeling  Visualization  High Performance Computing  Big Data  Statistical Computing  Numerical Computing  Data Mining  Text Processing  Computing Language  Educational Outreach  Computational thinking 6/1/2020 5
  6. 6. How to install pandas  You can run following command to install pandas. o pip install pandas o OR o conda install pandas  After installation you can check version of pandas by running the following command at python script mode. o Import pandas o pandas.__version__ 6/1/2020 6
  7. 7. Data Structures There are 2 data structures in pandas. o Series.(One D array) o DataFrame 6/1/2020 7
  8. 8. Series Data Structure Series is a one –dimensional array. It contain homogeneous data. Size is fixed. 10 20 30 40 50 60 70 6/1/2020 8
  9. 9. Series Data Structure Cont. Series is One Dimensional array to hold different types of values. o pandas.Series( data, index, dtype, copy) data: It can be any list, dictionary, or scalar value. index: The value of the index should be unique and hashable. It must be of the same length as data. If we do not pass any index, default np.arrange(n) will be used. dtype: It refers to the data type of series. copy: It is used for copying the data. 6/1/2020 9
  10. 10. Creating a Series We can create Series by using various inputs: o Array o Dictonary o Scalar value 6/1/2020 10
  11. 11. Using Array import pandas as pd import numpy as np info = np.array(['R','a','y','s','T','e','c', 'h']) a = pd.Series(info) print(a) 6/1/2020 11
  12. 12. Using Dictionary and scalar Value Using dictionary o import pandas as pd o info = {1:'Rays',2:'Tech'} o a = pd.Series(info) o print(a) Using Scalar Value o import pandas as pd a = pd.Series(4,index=[1,2,3,4,5]) print(a) 6/1/2020 12
  13. 13. Series Object attribute:  import pandas as pd x = pd.Series([1,2,3],index = ['a','b','c'])  # It returns a tuple of shape of the data. print("Shape:",x.shape)  # It returns the size of the data. print("Size:",x.size)  # Defines the index of the Series. print("Index:",x.index)  # It returns the number of dimensions in the data. print("Dimension:",x.ndim)  # It returns the values of series as list print("values:",x.values)  6/1/2020 13
  14. 14. Series Object attribute (cont.)  # It returns the data type of the data. print("Data Type:",x.dtype)  # It returns the number of bytes in the data print("Bytes:",x.nbytes)  # It returns True if Series object is empty, otherwise returns false. print("is empty:",x.empty)  #It returns True if there are any NaN values, otherwise returns false. print("is empty:",x.hasnans) 6/1/2020 14
  15. 15. Pandas Operations map() o import pandas as pd o import numpy as np o a = pd.Series(['Java', 'C', 'C++', np.nan]) o print({'Java':'Core',"C":'Python'})) std() o import numpy as np o print("Series:",np.std([1,2,3,4,5])) count() o import pandas as pd o import numpy as np o i = pd.Series([2, 1, 1, np.nan, 3,4,5,5]) o print(i.count()) 6/1/2020 15
  16. 16. Pandas Operations filter() o import numpy as np o import pandas as pd o arr=np.array(([20,30,40],[50,70,80])) o df=pd.DataFrame(arr,index=["Vijay","Jay"],columns= ["Physics","Chemistry","Maths"]) o print(df) o print(df.filter(items=["Physics"])) 6/1/2020 16
  17. 17. Data Frame & Panel Data Structure  DataFrame is a two–dimensional array.  It contain heterogeneous data.  Size is fixed.  Data Type of Columns.  Panel is three-dimensional array.  It is hard to represent the Panel In graphical representation.  But panel is consider as a Data Frame. NAME AGE ADDRESS Rahul 22 Indore Ram 35 Ayodhya Krishna 25 Mathura COLUMN NAME TYPE NAME String AGE Integer ADDRESS String 6/1/2020 17
  18. 18. Creating DataFrame  # importing the pandas library import pandas as pd #empty DataFrame df = pd.DataFrame() print (df) # Using List list=["Rays","Tech","SunilOs"] print(pd.DataFrame(list)) # using Dict info = {'ID' :[101, 102, 103],'Department' :['B.Sc','B.Tech','M.Tech',]} print(pd.DataFrame(info)) 6/1/2020 18
  19. 19. Creating DataFrame #Using Dict of series info = {'one' : pd.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f']), 'two' : pd.Series([1, 2, 3, 4, 5, 6, 7, 8], index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])} d1 = pd.DataFrame(info) print(d1) 6/1/2020 19 one two a 1.0 1 b 2.0 2 c 3.0 3 d 4.0 4 e 5.0 5 f 6.0 6 g NaN 7 h NaN 8
  20. 20. DataFrame Implementation import pandas as pd a list of strings x = ['Rays','Tech',"SunilOs"] Calling DataFrame constructor on list df = pd.DataFrame(x) print(df) 6/1/2020 20
  21. 21. Column Selection from a DataFrame:  from pandas import DataFrame,Series  x={'Id':Series([1,2,3,4,5]),  "Name":Series(["A","B","C","D","E"])}  df=DataFrame(x)  print(df)  print(df['Id'])  print(df["Name"]) 6/1/2020 21
  22. 22. Column Addition and Deletion to DataFrame  from pandas import DataFrame,Series x={'Id':Series([1,2,3,4,5]),"Name":Series(["A" ,"B","C","D","E"])} df=DataFrame(x) print(df)  # ADD a Column from DataFrame df['Subject']=Series(["C","C++","Java","python ","Java","Julia"])  del df["Id"] # Delete a Column from DataFrame Using del function print(df)  df.pop("Name") # Delete a Column from DataFrame Using pop function 6/1/2020 22
  23. 23. Row Addition from DataFrame  from pandas import DataFrame,Series  x={'Id':Series([1,2,3,4,5],index=['a','b','c','d ','e']), "Name":Series(["A","B","C"],index=['a','b','c']) }  df=DataFrame(x)  d=DataFrame({"Subject":Series([1,2,3],index=['a' ,'b','c'])})  print(df.append(d)) #OR  d=DataFrame([[1,"Ajay","Python"]], columns=["Id","Name","Subject"])  print(df.append(d)) 6/1/2020 23
  24. 24. Row Deletion from DataFrame: from pandas import DataFrame,Series x={'Id':Series([1,2,3,4,5],index=['a','b ','c','d','e']), "Name":Series(["A","B","C"],index=['a',' b','c'])} df=DataFrame(x) print(df.drop('a')) 6/1/2020 24
  25. 25. Row Selection from DataFrame  from pandas import DataFrame,Series  x={'Id':Series([1,2,3,4,5],index=['a','b','c ','d','e']), "Name":Series(["A","B","C"],index=['a','b',' c'])}  df=DataFrame(x) #by row label  print(df.loc['a']) #by using row integer row location  print(df.iloc[4]) #by using slice operator : print(df[0:3]) 6/1/2020 25
  26. 26. DataFrame Object Methods: append():Add the rows of other dataframe to the end of the given dataframe. apply():Allows the user to pass a function and apply it to every single value of the Pandas series. count(): Count the number of non-NA cells for each column or row. describe():Calculate some statistical data like percentile, mean and std of the numerical values of the Series or DataFrame. 6/1/2020 26
  27. 27. DataFrame Object Methods: (cont.)  drop_duplicate(): Remove duplicate values from the DataFrame.  head():Returns the first n rows for the object based on position.  mean(): Return the mean of the values for the requested axis.  merge():Merge the two datasets together into one.  pivot_table():Aggregate data with calculations such as Sum, Count, Average, Max, and Min.  query():Filter the dataframe.  Sort_index(): Sort the dataframe.  sum(): Return the sum of the values for the requested axis by the user.  to_excel(): Export the dataframe to the excel file.  transpose(): Transpose the index and columns of the dataframe. 6/1/2020 27
  28. 28. Importing Data: pd.read_csv(filename) : It read the data from CSV file. pd.read_table(filename) : It is used to read the data from delimited text file. pd.read_excel(filename) : It read the data from an Excel file. pd.read_sql(query,connection _object) : It read the data from a SQL table/database. pd.read_json(json _string) : It read the data from a JSON formatted string, URL or file. 6/1/2020 28
  29. 29. Time-Series in Panda  Time-series data is a data with equal no of intervals such as daily, monthly, yearly.  Data is collected on a fraction of time o import pandas as pd o from datetime import datetime o import numpy as np o range_date = pd.date_range(start ='1/1/2019', end ='1/08/2019', freq ='Min') o print(range_date) o print(len(range_date)) 6/1/2020 29
  30. 30. Panda Date Time  import pandas as pd  dmy = pd.date_range('2017-06- 04', periods=5, freq='S')  We can Format the String to date o import pandas as pd o import datetime o print(pd.to_datetime('18000706', format='%Y%m%d', errors='ignore') ) o print(datetime.datetime(1800, 7, 6, 0, 0) ) o print(pd.to_datetime('18000706', format='%Y%m%d', errors='coerce') )  6/1/2020 30
  31. 31. Pandas time offset It performs various operation on time i.e. adding and subtracting o import pandas as pd o # Create the Timestamp o p = pd.Timestamp('2018-12-12 06:25:18') o # Create the DateOffset o do = pd.tseries.offsets.DateOffset(n = 2) o # Add the dateoffset to given timestamp o new_timestamp = p + do o # Print updated timestamp o print(new_timestamp) 6/1/2020 31
  32. 32. Pandas Time Period It represents the time period i.e. day, month, quarter, year. It converts frequency to time period. o import pandas as pd o daily=pd.period_range("2000","2010",freq=' D') o print(daily) 6/1/2020 32
  33. 33. Image Processing 6/1/2020 33
  34. 34. Image Processing Image Processing is a process to perform some operation on the image such as: o Change color of the image o Resize image o Crop the image etc. There are many libraries available for image processing. Some common libraries are following: we will Use OpenCv for Image Processing. 6/1/2020 34
  35. 35. Image Processing Libraries  Scikit-image: o It uses a Numpy array as an image object. Skimage transforms the image file into n-dimensional array objects. The ndarray can be type of integer and float. It is fast because written in c  OpenCV: o It is written in C++ and Python. Opencv can work with Numpy, scipy and matplotlib. It is used in Image processing, face detection, Object detection etc.  Mahotas: o It uses advanced features such as image processing. It can process 2D and 3D images. It can extract information from images. It has more than 100 functions for computer vision processing. 6/1/2020 35
  36. 36. Image Processing Libraries  SimpleiTk: o It treats images as set of points in a space. It support many dimensions such as 2D, 3D and 4D  SciPy: o Scipy has a module named scipy.ndimage for image processing. Scipy basically used for scientific calculations  Pillow: o It support many image formats such. It is advanced version of PIL.  Matplotlib: o It is used for visualization or graph plotting, but we can use it for image processing. It does not support all image file formats. It can used for image altering 6/1/2020 36
  37. 37. What is OpenCV? OpenCV is a Python open-source library, which is used for computer vision in Artificial intelligence, Machine Learning, face recognition, etc.  6/1/2020 37
  38. 38. How OpenCV Works: Human eye provides information whatever he sees. But the machine knows only numbers. The Machine converts images into numbers on the basis of pixel data and stores them in memory. The picture intensity represented as a number. 6/1/2020 38
  39. 39. How OpenCV Works: 6/1/2020 39
  40. 40. Representation of Image  Gray scale: black and white images  RGB: Colorful images 6/1/2020 40
  41. 41. Why is OpenCV used for Computer Vision? Free of cost It is fast. Requires less. It is portable. 6/1/2020 41
  42. 42. How to install OpenCv: Open command prompt and type o pip install opencv-python Open conda prompt o conda install -c conda-forge opencv in anaconda prompt  6/1/2020 42
  43. 43. Supported File Format Window bitmaps - *.bmp, *.dib JPEG files - *.jpeg, *.jpg, *.jpe Portable Network Graphics - *.png Portable image format- *.pbm, *.pgm, *.ppm TIFF files - *.tiff, *.tif 6/1/2020 43
  44. 44. Reading and writing Images  import cv2  # read a file as grayscale  img=cv2.imread("Desert.jpg",0)  cv2.imshow("Image",img)  cv2.waitKey(0)  # write to a file  res=cv2.imwrite("copy.jpg",img)  print("status:",res) 6/1/2020 44
  45. 45. OpenCV Basic Operation on Images: Access pixel values and modify them Access Image Properties Setting Region of Image Splitting and merging images Change the image color 6/1/2020 45
  46. 46. Accessing Image Properties import cv2 img = cv2.imread("Desert.jpg",1) print("Height:",img.shape[0]) print("Width:",img.shape[1]) print("channel:",img.shape[2]) print("size:",img.size) 6/1/2020 46
  47. 47. Making Borders for Images:  import cv2 as cv  BLUE = [255,0,0]  img1 = cv.imread("Desert.jpg",1)  img =  cv.copyMakeBorder(img1,50,10,10,10,cv.BORDER _CONSTANT, value=BLUE) 6/1/2020 47
  48. 48. OpenCV Resize the image:  import cv2  img = cv2.imread("Desert.jpg", 1)  scale = 60  width = int(img.shape[1] * scale / 100)  height = int(img.shape[0] * scale / 100)  dim = (width, height)  print("Initial dimension ",img.shape)  # resize image  resized = cv2.resize(img, dim)  print('Resized Dimensions : ', resized.shape)  cv2.imshow("Resized image", resized)  cv2.waitKey(0)  cv2.destroyAllWindows() 6/1/2020 48
  49. 49. OpenCV Drawing Functions  o, center, radius, color[,thickness [, lineType[,shift]]])  cv2.rectangle(): o cv2.rectangle(img, pt1, pt2, color[, thickness[,lineType[,shift]]])  cv2.ellipse(): o cv2.ellipse(img, center, axes, angle, startAngle, endAngle, color[, thickness[, lineType[, shift]]])  cv2.ellipse: o Cv2.ellipse(img, box, color[, thickness[, lineType]])  cv2.line: o cv2.line(img, pt1, pt2, color[, thickness[,lineType[, shift]]])  Put text: o cv2.putText(img, text, org, font, color)  To Draw Polylines: o cv2.polyLine(img, polys, is_closed, color, thickness=1, lineType=8, shift=0) 6/1/2020 49
  50. 50. Drawing some shapes on image  import numpy as np  import cv2  img = cv2.imread("Desert.jpg",1)  cv2.rectangle(img,(15,25),(200,150),(0,255,5 5),15)  cv2.imshow('image',img)  cv2.waitKey(0)  cv2.destroyAllWindows() 6/1/2020 50
  51. 51. Write Text to Image:  import cv2  font = cv2.FONT_HERSHEY_SIMPLEX  # Create a black image.  img = cv2.imread("Desert.jpg",1)  cv2.putText(img,'Rays Technology',(10,50),font,2,(100,255,255),2)  #Display the image  cv2.imshow("image",img)  cv2.waitKey(0) 6/1/2020 51
  52. 52. Crop the Image import cv2 img=cv2.imread("Desert.jpg") y=300 x=200 h=100 w=200 crop = img[y:y+h, x:x+w] # crop=img[300,500] cv2.imshow("Image:",crop) cv2.waitKey(0) 6/1/2020 52
  53. 53. Crop rectangle image to circle  import cv2  img=cv2.imread(“Koala.jpg")  img1=cv2.resize(img,(300,300))  h,w,c=img1.shape  print(h,w,c)  center=(w//2,h//2)  radius=180  color=(255,255,255)#white  thickness=80 ,center,radius,color,thickness)  cv2.imshow("Cropped",img1)  cv2.waitKey(0)  cv2.imwrite("Koala1.jpg",img1) 6/1/2020 53
  54. 54. OpenCV Resize the image:  import cv2  img = cv2.imread("Desert.jpg", 1)  scale = 60  width = int(img.shape[1] * scale / 100)  height = int(img.shape[0] * scale / 100)  dim = (width, height)  print("Initial dimension ",img.shape)  # resize image  resized = cv2.resize(img, dim)  print('Resized Dimensions : ', resized.shape)  cv2.imshow("Resized image", resized)  cv2.waitKey(0)  cv2.destroyAllWindows() 6/1/2020 54
  55. 55. Making Borders for Images:  import cv2 as cv  BLUE = [255,0,0]  img1 = cv.imread("Desert.jpg",1)  replicate =cv.copyMakeBorder(img1,50,10,10,10,cv.BORDE R_CONSTANT, value=BLUE)  cv.imshow("replicate",replicate)  Cv.waitKey(0) 6/1/2020 55
  56. 56. OpenCV Canny Edge Detection:  import cv2  img = cv2.imread('Koala.jpg')  edges = cv2.Canny(img, 100, 200)  cv2.imshow("Edge Detected Image", edges)  cv2.imshow("Original Image", img)  cv2.waitKey(0) # waits until a key is pressed  cv2.destroyAllWindows() 6/1/2020 56
  57. 57. Python Numpy The fundamental package for scientific computing with Python 6/1/2020 57
  58. 58. What is Numpy Numpy stands for numeric python. Numpy module is used to perform computation and processing on multidimensional and single dimensional arrays. 6/1/2020 58
  59. 59. Advantages Of NumPy It is used for array Oriented Computing. High efficiency in a multidimensional array. Use for scientific computation. We can reshape the data which is stored in multidimensional array NumPy provides the in-built functions for linear algebra and random number generation. 6/1/2020 59
  60. 60. NumPy Environment Setup NumPy doesn't come bundled with Python. We have to install it using the python pip installer. Execute the following command. o pip install numpy OR The Anaconda provides SciPy stack which is a free distribution of the Python SciPy package 6/1/2020 60
  61. 61. Numpy Ndarray Ndarray is an object to store similar kinds of elements. We can create a collection of elements of the same data type. The array index starts from 0. 6/1/2020 61
  62. 62. How to Create Ndarray  To create ndarray first import the numpy module and use array() function. o import numpy as np o a=np.array o print(a) o za=n.zeros((2,3),dtype=int) o print("Zeros array:n",za) o oa=n.ones(shape=(3,2)) o print("Ones arrayn",oa) o emp=n.empty(shape=(1,2),dtype=int) o print("empty array:n",emp) 6/1/2020 62
  63. 63. Array From existing data:  numpy.array(object, dtype = None, copy = True, order = None, subok = False, ndmin = 0)  object- it can be set, tuple dictionary etc.  dtype-it represents the type of the object. By default the type is none  copy- It means the object is copied. By default value is true. It is an optional attribute.  order- we can arrange arrays in 3 possible ways C(column order), R(row order), A(any).  subok- by default the base class is array, by making this attribute true we can make subclasses  ndim- To define the minimum dimension of the array. 6/1/2020 63
  64. 64. Array From list data:  import numpy as np  ar=np.array({1,2,3})#OR  ar=np.array([1,2,3],float)#OR  ar=np.array((1,2,3),complex)  l=[[5,6,7],[1,2,3],[8,9,7]]  la=n.asarray(l)  print("Array from List:n",la)  print(ar)  print(type(ar)) 6/1/2020 64
  65. 65. Array from tuple t=((2,2,3),(1,2,3)) a=n.asarray(t,dtype=float) print("Array From tuple:n",a) 6/1/2020 65
  66. 66. Ndarray Object attributes: ndim-ndim function is used to find the dimension of the given array. dtype- is used to get the type of the elements of the array shape- It is used to get the shape of the given array. Menas no of rows and columns in a array. size- is used to get the no of elements of the array of size of the array Itemsize- to find the size of each element in the array. 6/1/2020 66
  67. 67. Array Attribute import numpy as np a=np.array([[1,2,3],[2,3,4],[4,5,6]]) print(a) print("Dimension:",a.ndim) print("Data type:",a.dtype) print("Size of array:",a.size) print("Shape of Array:",a.shape) print("Size of elements:",a.itemsize) 6/1/2020 67
  68. 68. Restructure the shape of the ndarray: import numpy as np a=np.array([[1,2],[2,3],[5,6]]) print("Original array") print(a.shape) ra=a.reshape(2,3) print(ra) 6/1/2020 68
  69. 69. Slicing in the Array- To extract the range of elements from a array we can use slicing. It is same as we have used in the list import numpy as np a=np.array([1,2,3,4,5,6]) print(a[1:3]) 6/1/2020 69
  70. 70. Numpy array using numerical Range: Arange: The arange() function is used to create an array of evenly spaced over the given range. Linspace: The linspace() function is used to get evenly spaced elements over the given interval range. Logspace: lgapace() function is used to create an array of evenly spaced values according to log scale.  6/1/2020 70
  71. 71. Numpy array using numerical Range: import numpy as np a=np.linspace(1,20,5) print(a) a=n.arange(0,10,2,float) print("Ranged Array:n",a) l=n.logspace(2,10,5) print("LogSpaced Array:n",l) 6/1/2020 71
  72. 72. Aggregate Operations on array We can perform aggregate function on the array such as: sum(), max(), min() etc. from numpy import * a=array([1,2,3,4,5]) print(type(a)) print("Sum:",sum(a)) print("Max:",max(a)) print("Min:",min(a)) print("Std:",std(a)) print("Square Root:",sqrt(a)) 6/1/2020 72
  73. 73. Arithmetic Operations on array To perform arithmetic operations the shape of the array must be same  from numpy import *  a=array([[1,2,3],[2,3,4]])  b=array([[2,4,6],[4,6,8]])  print("Addition:",a+b)  print("Subtraction:",a-b)  print("Multiplication:",a*b)  print("Division:",a/b) 6/1/2020 73
  74. 74. Ndarray Axis  We can perform operations row and column wise on array.  from numpy import * a=array([[1,2,3],[1,3,4]]) print(a) print("Max:",a.max(axis=1)) print("Max:",a.min(axis=1)) print("Max:",a.sum(axis=1)) 6/1/2020 74
  75. 75. Array concatenation we can concat multidimensional arrays vertically and horizontally from numpy import * a=array([[1,2,3],[1,3,4]]) b=array([[5,6,7],[8,9,10]]) print("Vertically concat:n",vstack((a,b))) print("Horizontally concat:n",hstack((a,b))) 6/1/2020 75
  76. 76. Iteration over the Array nditer() function is used to iterate through arrays. import numpy as n na=n.array([[1,2,3],[5,6,7]]) print("Array:n",na) for x in n.nditer(na):  print(x,end=" ") 6/1/2020 76
  77. 77. Numpy String Function Sr. no Function Description 1 add() It is used to add string type element to array 2 multiply() It is used to obtain the multiplied copy of the string element 3 center() It is used to make a center aligned string filled with specified no character. 4 capitalize() It returns the copy of the string with the first letter capitalized. 5 title() It returns the title case string. 6 lower() It returns a string with lowercase letters . 7 upper() It returns a string with uppercase letters. 8 split() It returns the words from string. 9 splitlines() It breaks the list of lines 10 strip() It removes the left and right spaces. 11 join() It joins the given sequence of strings. 12 replace() It replaces all the occurring words of substring in a string. 13 decode() It is used to decode the string in specified format. 14 encode() It is used to encode a string into a specified codec. 6/1/2020 77
  78. 78. Numpy String Function(cont.)  import numpy as np  str=["Vijay","Ajay","Sanjay"]  str1=[" Chouhan"," Sharma"," singh"]  print("Original array:n",str)  print("Add:n",np.char.add(str,str1))  print("Multiply:n",np.char.multiply(str,3))  print("Center:n",,6))  print("Strip:n",np.char.strip(str))  enstr=np.char.encode("Welcome","cp500")  print("Encode:",enstr)  print("Decode:",np.char.decode(enstr,"cp500" )) 6/1/2020 78
  79. 79. Numpy Mathematical Function: Sr. No Function Description 1 sin() Used to calculate sin of given angles 2 cos() Used to calculate cosine of given angles 3 tan() Used to calculate tangent of given angles 4 arcsin() Used to calculate inverse of sin of given angles 5 arccos() Used to calculate inverse of cosine of given angles 6 arctan() Used to calculate inverse of tangent of given angles 7 degree() It is used to convert the angles in degree 8 around() It is used to round off the value according to the given number. 9 ceil() It is used to calculate the ceil value of the given no 10 floor() It is used to calculate the floor value of the given no. 6/1/2020 79
  80. 80. Numpy Function: Sr. No Function Description 1. array() To create a n dimensional array 2. append() To add new values to existing array 3. concatenate() To concatenate arrays into a single array 4. reshape() It is used to change the shape of the existing array 5. sum() It is used to get the sum of the whole array or row and column wise sum of arrays. 6. random() It is used to get the array of random no according to the called function. 7 zeros() It returns the array containing zeros of given shape. 8. log() It is used to calculate log of each elements in the array 9. transpose() It is used to change the shape of the array 10 unique() It is used to get the unique elements from the array. 6/1/2020 80
  81. 81. Numpy Function: 11. std() It is used to calculate the standard deviation along the axis. 12 mean() It is used to calculate arithmetic mean to given axis 13. sort() It is used to sort the element of the given array. 14 average() It is used to calculate the weighted average of the array 15 arange() It is used to get an array according to given start, stop and steps 16 linspace() It is used to get the evenly spaced array according to given size. 17 trunc() It is used to truncate the value of the given array 18. asarray() It is used to convert the python sequence into numpy array 6/1/2020 81
  82. 82. Numpy Data Types- numpy provides the wide range of datatypes. We can create the dtype object as following: numpy.dtype(object, align , copy) o import numpy as np o d=np.dtype([('salary',np.float)]) o arr=np.array([(10000.12,),(20000.50,)],dtype=d) o print(arr['salary']) 6/1/2020 82
  83. 83. Matplotlib 6/1/2020 83
  84. 84. Matplotlib It is a python library developed by John D. Hunter in 2002.  Latest release is 3.1.1 released in 2019. It is used for data visualization on multi platforms It can be used in python script, shell, web application, and other python gui toolkits. 6/1/2020 84
  85. 85. Matplotlib Architecture  There are three different layers in the architecture of the matplotlib which are the following:  Backend Layer- This is the bottom layer which is responsible for implementation of the function which is required for plotting. There are 3 important classes- FigureCanvas(surface to draw figure) , Renderer(responsible for drawing on surface), Event(handles mouse and key events).  Artist layer - This is the middle layer of the architecture. It takes care of all the plotting function  Scripting layer- This is the top most layer where all the code runs. 6/1/2020 85
  86. 86. Different part of the Matplotlib: 6/1/2020 86
  87. 87. Different part of the Matplotlib: Figure: It is a canvas that holds plots. Axes: a figure can contain several plots. Axes is a plot. Axis: axis represents the limit of the plots.. Artist: artist are all visual objects which are shown on the figure such as text objects, lines and collection. 6/1/2020 87
  88. 88. Environment setup: If you are installed in anaconda distribution then matplotlib is pre-installed in the anaconda distribution no further installation required. Or we can install using pip command as following: o Pip install matplotlib To know installed version: o import matplotlib o matplotlib.__version__ 6/1/2020 88
  89. 89. Types of Graphs: We can plot different kinds of graphs in matplotlib with different styles. Some of the graph as following: Bar graph Scatter graph Line Graph Pie Charts Histograms Box plot 6/1/2020 89
  90. 90. Some Important Functions in Matplotlib: Sr. no. Function Description 1 axes() T is used to add axes(plot) to a given figure. 2 bar(values, color) It is used to create the horizontal bar graph with given values and color 3 barh(values, color) It is used to create horizontal bar graphs with given color and values. 4 figure() It is used to create a figure class instance, and handle figure level attribute 5 grid(bool) It is used to show the grids on the plots 6 hist(values, bin) It is used to create a histogram 7 legend() It is used to show the legend on the axis which describes the elements of the graph 8 pie(values,categorical values) It is used to create a pie chart. 9 plot(x values, y values) It is used to plot the simple line graph with given x and y axis values 10 plot3D(x,y,z) It is used to draw a 3D graph with given axis values. 6/1/2020 90
  91. 91. Some Important Functions in Matplotlib: 11 subplot(row,col,index) It is used to create a subplot for a given figure. 12 suptitle(str) It is used to set the common title for all the plots 13 scatter(x values, y values) It is used to create a scatter plot. 14 set_xlabel(str) This is the axes level method.It is used to give the label to the x axis. 15 set_ylabel(str) This is the axes level method.It is used to give the label to the y axis. 16 scatter3D(x values, y, value, z values) It is used to draw a scatter graph in 3 dimension space 17 savefig(loc) It is used to save the plotted graph to the specified Location 18 show() It is used to makeplotted graph visible 19 title(str) It is used to set the title of the plot 20 text() It is used to set the text to the plot 21 xlabel(str) It is used to set the label of the x axis of a plot 22 ylabel(str) It is used to set the label text of a y axis of a plot 6/1/2020 91
  92. 92. Simple graph with matplotlib import matplotlib.pyplot as plt #plotting our canvas plt.plot([1,2,3],[4,5,1]) #display the graph  6/1/2020 92
  93. 93. Add title and label to the axis import matplotlib.pyplot as plt x = [3, 4, 8] y = [5, 12, 2] plt.plot(x, y) plt.title('Line graph') plt.ylabel('Y axis') plt.xlabel('X axis')  6/1/2020 93
  94. 94. Changing Color and style of the line in a graph Matplotlib support following abbreviations for color: b (blue), r (Red), g (Green), c (Cyan), m (Magenta), y(Yellow),k (Black), w(white) ‘b’- to get a blue solid line with default shape. ‘ro’- Red color circle ‘-g’ - Green solid line ‘--’ - dashed line with default color. ‘^k:’ - Black triangle connected by dotted lines. ‘d:’ - Diamond shapes with default color, connected by dashed line 6/1/2020 94
  95. 95.  import matplotlib.pyplot as plt  plt.title("Line Graph")  plt.ylabel("Y axis")  plt.xlabel("X axis")  x=[1,2,3,4,5]  y=[2,4,6,8,10]  #data points with red color and square Shapes  plt.plot(x,y,"rs")  6/1/2020 95
  96. 96. Line Graph  import matplotlib.pyplot as plt plt.title("Line Graph") plt.plot([1,4,5],[5,3,2],'r',linewidth=5, label="Line1") plt.plot([1,5,3],[6,4,9],'k',linewidth=5, label="Line2") plt.legend() plt.grid(True) 6/1/2020 96
  97. 97. fill_between():  import matplotlib.pyplot as plt import numpy as np x = np.arange(0.0, 2, 0.01) # x = np.arange(1, 5, 1) y1 = np.sin(2 * np.pi * x) y2 = 1.2 * np.sin(4 * np.pi * x) fig, ax = plt.subplots(1, sharex=True) ax.plot(x, y1, x, y2, color='black') ax.fill_between(x, y1, y2, where=y2 >= y1, facecolor='blue') ax.fill_between(x, y1, y2, where=y2 <= y1, facecolor='red') ax.set_title('fill between where') 6/1/2020 97
  98. 98. Output 6/1/2020 98
  99. 99. Plotting More Than one Graph in a single Figure: Matplotlib provides the function to plot more than one graph in a single figure. subplot() subplots() subplot2grid() 6/1/2020 99
  100. 100. subplot() Example:  import matplotlib.pyplot as plt  x=["Vijay","Jay","Ajay"]  y=[90,80,70]  plt.subplot(131) ,y)  plt.subplot(132)  plt.scatter(x,y)  plt.subplot(133)  plt.plot(x,y,"^k")  6/1/2020 100
  101. 101. subplot 6/1/2020 101
  102. 102. subplots():  this function provides a grid like structure to plot the subplots: o plt.subplots(nrow,ncols) o from matplotlib import pyplot as plt o fig,a=plt.subplots(2,2) o x=[1,4,3] o y=[5,6,7] o import numpy as np o a[0][0].set_title("Line") o a[0][0].plot(x,y) o a[0][1].set_title("Sin") o a[0][1].plot(x,np.sin(x)) o a[1][0].set_title("Cosine") o a[1][0].plot(x,np.cos(x)) o a[1][1].set_title("tangent") o a[1][1].plot(x,np.tan(x)) o plt.savefig("subplots.png") o 6/1/2020 102
  103. 103. subplots 6/1/2020 103
  104. 104. subplot2grid():  with the help of this function we can vary the size of the plots. o Plt.subplot2grid(shape, location, rowspan, colspan) 6/1/2020 104
  105. 105. subplot2grid() implementation  import matplotlib.pyplot as plt a1 = plt.subplot2grid((3,3),(0,0),colspan = 2) a2 = plt.subplot2grid((3,3),(0,2), rowspan = 3) a3 = plt.subplot2grid((3,3),(1,0), colspan = 2) a4 = plt.subplot2grid((3,3),(2,0),colspan = 2) import numpy as np x = np.arange(1,10) a2.plot(x, x*x) a2.set_title('square') a1.plot(x, np.exp(x)) a1.set_title('exp') a3.plot(x, np.log(x)) a3.set_title('log') a4.plot(x, np.sin(x)) a4.set_title('sign') plt.tight_layout() 6/1/2020 105
  106. 106. Bar Graph:  Bar graphs are mostly used to show association among the data. The matplotlib provides a bar() function to draw a bar graph. We have to pass the categorical variable , values and color.  from matplotlib import pyplot as plt x=["Ram","Shyam","Ghanshyam"] y=[30,90,70] plt.title("Bar Graph") plt.xlabel("Name") plt.ylabel("Marks") plt.grid(True),y,color="cyan") 6/1/2020 106
  107. 107. Horizontal Bar Graph  barh(): this function is used to draw a horizontal graph:  from matplotlib import pyplot as plt x=["Ram","Shyam","Ghanshyam"] y=[30,90,70] plt.title("Bar Graph") plt.xlabel("Marks") plt.ylabel("Name") plt.grid(True) #,y,color="cyan") plt.barh(x,y,color="Black") 6/1/2020 107
  108. 108. Pie Chart:  Pie chart is a circular graph. It is used to show the percentage of data with the help of slices or segments of the circle.  from matplotlib import pyplot as plt x=["Jay","Vijay","Sanjay"] y=[12,27,60] #distance from center ex=[0.1,0,0] colors=["Gold","Silver","cyan"] plt.pie(y,ex,labels=x,colors=colors,autopct='%1. 1f%%',shadow=True,startangle=90) plt.title("Pie Chart") plt.legend(loc="lower left") 6/1/2020 108
  109. 109. Pie Chart (cont.) 6/1/2020 109
  110. 110. Scatter Plot:  It is Used to compare variables. The data is displayed as a points.  from matplotlib import pyplot as plt x = [5,7,10] y = [18,10,6] x2 = [6,9,11] y2 = [7,14,17] plt.scatter(x, y) plt.scatter(x2, y2, color='g') plt.title('Info') plt.ylabel('Y axis') plt.xlabel('X axis') plt.grid(True) 6/1/2020 110
  111. 111. Histogram Charts:  There is a difference between the histogram and bar graph  Histogram is Used to show distribution of data, bar graph is used to show comparison of different entities.  A histogram is a type of bar plot that shows the frequency of a number of values compared to a set of values ranges.  There is four histtype available: o bar-by default o barstaked- multiple data stacked on top of each other o step- line plot by default unfilled o stepfilled- line plot with filled area.  6/1/2020 111
  112. 112. Histogram implemenatation  from matplotlib import pyplot as plt # from matplotlib import pyplot as plt Marks = [21,53,60,49,25,27,30,42,40,1,2,102,95,8,15, 105,70,65,55,70,75,60,52,44,43,42,45] bins = [0,10,20,30,40,50,60,70,80,90,100] plt.hist(Marks, bins, histtype='bar', rwidth=0.8) plt.xlabel('age groups') plt.ylabel('Number of people') plt.title('Histogram') 6/1/2020 112
  113. 113. 3D graph plot: Firstly the matplotlib package is developed to draw 2D graphs. But after 1.0 version to draw 3D graphs all the utilities are added above the 2D sets. Three-dimension plots can be created by importing the mplot3d toolkit, include with the main Matplotlib installation: o from mpl_toolkits import mplot3d 6/1/2020 113
  114. 114. 3D graph (cont.)  from mpl_toolkits import mplot3d import numpy as np import matplotlib.pyplot as plt fig = plt.figure() ax = plt.axes(projection='3d') x=np.arange(10) y=np.sin(x) z=np.cos(x) ax.plot(x,y,z) ax.set_xlabel("X axis") ax.set_ylabel("Y axis") ax.set_zlabel("Z axis") print(x) 6/1/2020 114
  115. 115. Output 6/1/2020 115
  116. 116. Working with text: We can use a subset TeXmarkup in any Matplotlib text string by placing it inside a pair of dollar signs ($). To make subscripts and superscripts, use the '_' and '^' symbols −  6/1/2020 116
  117. 117. Working with text (cont.)  from matplotlib import pyplot as plt  plt.title("How to add Text ")  plt.text(0.29,0.18,"$E=MC^2$", color=“blue", fontsize=50, style="italic", bbox = {'facecolor': 'red'})  6/1/2020 117
  118. 118. Disclaimer This is an educational presentation to enhance the skill of computer science students. This presentation is available for free to computer science students. Some internet images from different URLs are used in this presentation to simplify technical examples and correlate examples with the real world. We are grateful to owners of these URLs and pictures. 6/1/2020 118
  119. 119. Thank You! 6/1/2020 119

Notas del editor