Scope - The primary focus of this presentation is how to leverage open source software to help manage shared storage performance. The emphasis is on the storage server, and on ESS in particular. The solution described is a small, one-off solution.
5. What is a SAN? [Diagram: servers attach through Edge Switches A and B, which connect over ISLs to Core Switches A and B, then over links to Storage Switches A and B and on to the storage servers.]
6. What Can We Measure on the Attached Server? (Server view of the SAN: a LUN on the storage server reached over Path A and Path B through two HBAs.)
- Physical volume: read KB/sec, write KB/sec, I/Os per second, reads/sec, writes/sec, end-to-end response time
- Virtual path/LUN: read KB/sec, write KB/sec, I/Os per second, reads/sec, writes/sec
- Adapter (HBA): read KB/sec, write KB/sec, I/Os per second
7. What SAN Fabric Components Can We Measure? [Diagram: the same fabric as slide 5 (Edge Switches A and B, ISLs, Core Switches A and B, links, Storage Switches A and B), asking what can be measured at each point.]
8. What Can We Measure on the Storage Server?
- Host adapter ports: KB/sec, response time
- Logical volume: NVS delays, cache hits
- Physical (array/disk): reads, writes, sequential I/Os, KB/sec, I/O time
10. What is the Solution? (See Appendix H for requirements.)
- Collect: sar, iostat, filemon on the servers; MRTG and SNMP for the switches; TSE, DB2, and Perl for the ESS
- Store: MySQL and RRDtool
- Post-process: Perl, PHP
- Extract/show: Perl, PHP, served by Apache to a browser
Legend: components are a mix of pure OSS and OSS plus glue code, spanning the server, ESS, and switch data sources.
21. Storage Server - Chart Array Exceptions. Starting from the exception table on the previous slide, clicking an exception drills down to a chart of that exception.
23. Appendix A - Measure End-to-End Host Disk I/O Response Time (see Appendix B for links to more information)
- AIX: filemon -o /tmp/filemon.log -O all — read time (ms), write time (ms)
- Solaris: iostat -xcn 2 5 — svc_t (ms)
- HP-UX: sar -d — avserv (ms)
- Linux: iostat -d 2 5 — svctm (ms); the iostat package for Linux is only valid with 2.4 and 2.6 kernels
- NT/Wintel: perfmon, Physical Disk object — Avg. Disk sec/Read
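As a rough illustration of harvesting these metrics, the sketch below pulls the svctm column out of Linux iostat -dx output by header name rather than fixed position, since the column layout varies across sysstat versions. The sample output embedded here is illustrative, not captured from a real host.

```shell
#!/bin/sh
# Sketch: pick the per-device service time (svctm) column out of
# `iostat -dx` output by header name, since column positions differ
# across sysstat versions. The sample below is illustrative, not
# captured from a real host.
sample='Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.20 3.40 5.60 120.00 240.00 40.0 0.02 1.80 0.90 0.81
sdb 0.00 0.40 1.10 0.90 44.00 36.00 40.0 0.01 2.10 1.30 0.26'

result=$(echo "$sample" | awk '
  $1 == "Device:" { for (i = 1; i <= NF; i++) if ($i == "svctm") col = i; next }
  col { printf "%s svctm=%sms\n", $1, $col }')
echo "$result"
```

In practice the sample variable would be replaced by a live pipe from iostat, and the same header-matching idea carries over to sar and the other per-OS tools.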
24. Appendix B: Getting LUN Serial Numbers for ESS Devices
Note: ESS utilities for AIX/HP-UX/Solaris are available at: http://www-1.ibm.com/servers/storage/support/disk/2105/downloading.html
Host config: http://www.redbooks.ibm.com/abstracts/tips0553.html
- AIX, HP-UX, Solaris: lsvp -a (ESS utility) — LUN SN; other metrics: VG, hostname, connection, hdisk
- Linux: lsvpcfg (SDD) — device name, LUN SN
- Wintel: datapath query device (SDD) — device name, serial
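A small sketch of turning such command output into a lookup table: the lines below imitate SDD lsvpcfg-style output, but the exact field layout is hypothetical and varies by SDD release, so the awk logic would need adjusting against real output.

```shell
#!/bin/sh
# Sketch: build a vpath -> LUN serial lookup from SDD `lsvpcfg`-style
# output. The sample lines and field layout are hypothetical; real
# lsvpcfg output varies by SDD release, so adjust the awk logic.
sample='vpath0 (Avail pv vg01) 75512811523 = hdisk2 hdisk6
vpath1 (Avail pv vg01) 75512811524 = hdisk3 hdisk7'

# Take the field just before "=" as the LUN serial for each vpath.
result=$(echo "$sample" | awk '
  { for (i = 2; i <= NF; i++) if ($i == "=") print $1, $(i - 1) }')
echo "$result"
```

The resulting vpath/serial pairs are what lets server-side performance data be correlated back to specific LUNs on the storage server.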
28. Appendix F: DB2 Query for Array Performance Data
Note: This information is relevant only if you have TotalStorage Expert installed and access to the DB2 command line on the TSE server.
SELECT DISTINCT A.*, B.M_CARD_NUM, B.M_LOOP_ID, B.M_GRP_NUM
FROM DB2ADMIN.VPCRK A, DB2ADMIN.VPCFG B
WHERE A.PC_DATE_B >= '%STARTDATE'
  AND A.PC_DATE_E <= '%ENDDATE'
  AND A.PC_TIME_B >= '%STARTTIME'
  AND A.PC_TIME_E <= '%ENDTIME'
  AND A.M_MACH_SN = '%ESSID'
  AND A.M_MACH_SN = B.M_MACH_SN
  AND A.M_ARRAY_ID = B.M_ARRAY_ID
  AND A.P_TASK = B.P_TASK
ORDER BY A.M_ARRAY_ID, A.PC_DATE_B, A.PC_DATE_E
WITH UR;
29. Appendix G: DB2 Query for Array Configuration Data
Note: This information is relevant only if you have TotalStorage Expert installed and access to the DB2 command line on the TSE server.
SELECT DISTINCT A.M_MACH_SN, A.M_MODEL_N, A.M_CLUSTER_N, A.M_RAM, A.M_NVS,
       C.I_DDM_RPM, C.I_DDM_GB_CAPACITY
FROM DB2ADMIN.VPVPD A, DB2ADMIN.VMPDX B, DB2ADMIN.VcMDDM C
WHERE A.M_MACH_SN = B.I_VSM_SN
  AND B.I_VSM_IDX = C.I_VSM_IDX
ORDER BY A.M_MACH_SN, A.M_CLUSTER_N;
33. Biography. Brett Allison has been doing distributed systems performance work since 1997, including J2EE application analysis, UNIX/NT, and storage technologies. His current role is Performance and Capacity Management team lead for ITDS. He has developed tools, processes, and service offerings to support storage performance and capacity, has spoken at a number of conferences, and is the author of several white papers on performance.
Editor's notes
“Shared storage” typically refers to storage shared on a SAN. This includes the Storage Area Network switches and other fabric components (ISLs, routers, etc.). We can measure many of the components in the SAN, including but not limited to server HBAs, switch ports, and storage server I/O components. Link information includes throughput, packets/sec, and errors.
From the point of view of the server, storage is allocated as physical disks. These disks are accessed via Host Bus Adapters (HBAs), and throughput statistics are available on most systems at the HBA level.

If multi-pathing software is implemented, the virtual path typically corresponds to the storage allocation unit on the storage server (the LUN). Most servers provide throughput information for the virtual paths. With multi-pathing, more than one host physical volume will point to the same virtual path. In addition to throughput information, most servers provide end-to-end response time for the physical volumes, which makes it possible to identify whether one path is performing better than another (e.g., fabric congestion).

All of this information can be measured on most servers using native utilities. There is typically very little visibility into the storage server or network pathing from the server's viewpoint. In some cases vendor-specific, server-based utilities can provide configuration information that can be used to summarize server-based performance data, for example I/Os to a particular storage server component. In any case, these views do not provide visibility outside the server.
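The path-comparison idea above can be sketched as a few lines of shell: compare I/O rates on the two host physical volumes that back one virtual path and flag a large skew. The hdisk names and tps numbers here are illustrative; in practice they would come from iostat, filtered to the hdisks behind a single vpath.

```shell
#!/bin/sh
# Sketch: flag a path imbalance for one LUN reached through two host
# physical volumes. The names and tps values below are illustrative,
# not measured data.
result=$(echo 'hdisk2 420
hdisk6 35' | awk '
  { name[NR] = $1; tps[NR] = $2 }
  END {
    hi = (tps[1] >= tps[2]) ? 1 : 2; lo = 3 - hi
    # Flag if the busier path carries more than twice the I/O rate.
    if (tps[lo] > 0 && tps[hi] / tps[lo] > 2)
      printf "imbalance: %s=%d vs %s=%d\n", name[hi], tps[hi], name[lo], tps[lo]
    else
      print "paths balanced"
  }')
echo "$result"
```

The 2x threshold is an arbitrary choice for the sketch; a real monitor would tune it to the workload.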
We can measure many of the components in the SAN, including but not limited to server HBAs, switch ports, and storage server I/O components. The switch ports can provide information such as throughput (KB/sec), packets/sec, and errors/sec.
KB/sec and response time for the ports on the host adapter (HA) side are only available via the API and the CLI; they are not available in the TSE, MDM, or TPC for Disk. The TSE has two performance tables: VPCRK (array/disk group) and VPCCH (LUN). Higher-level components are not measured directly; however, you can roll the data up to higher levels such as ESS, cluster, adapter, and loop. In addition to the raw data, several important fields are included in the array-level data: array average RT (ms), average disk utilization, % sequential, and % read. The NVS and cache hit stats are stored at the LUN level.
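The roll-up just mentioned is a simple group-by; the sketch below averages array-level response times up to the cluster level. The three columns (array, cluster, avg RT ms) are hypothetical stand-ins for fields pulled from the TSE array-level tables.

```shell
#!/bin/sh
# Sketch: roll array-level response times up to the cluster level.
# Input columns (array, cluster, avg RT ms) are hypothetical stand-ins
# for fields extracted from the TSE VPCRK data.
result=$(echo 'A01 1 4.2
A02 1 6.0
A03 2 3.1
A04 2 3.5' | awk '
  { sum[$2] += $3; n[$2]++ }                      # accumulate per cluster
  END { for (c in sum) printf "cluster %s avg_rt=%.2f ms\n", c, sum[c] / n[c] }')
echo "$result"
```

The same pattern extends to the adapter and loop levels by grouping on a different configuration column.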
At this point I have not found a single OSS or vendor tool that provides all the information necessary to manage the performance of the SAN. A number of vendors appear to have tools for this, including IBM/Tivoli, but either the tools are not compatible with our environment or they cannot yet provide all the features we require.
Attached-server configuration data is important, and the commands used depend on the OS type and the storage server type. There are a number of options for monitoring ESS, including TSE, MDM, TPC for Disk, the CLI, the API (via a CIM agent/CIMOM), and several 3rd-party products, and they all have their advantages and disadvantages. TSE, MDM, and TPC for Disk all offer similar data stored in DB2 tables. The CLI provides the information in a formatted report that would require significant reformatting prior to analysis. The API is a potential option, but I simply did not have the time to create my own collector. As a result, I am currently using a number of different tools to manage the performance of the environment. Collectively these tools provide the essentials; open source is used in some cases but not all, as seen in the slide.
Requirements:
- Monitor SAN components
- Utilize existing data sources: TotalStorage Expert for the ESS; EFCM log files for the fabric (SNMP with MRTG in the future); native distributed server utilities for attached servers
- Monitor frequently enough to be useful
- Store data for historical purposes and trending
- Correlate disparate components
To collect with MRTG you can run it as a daemon or run it from cron (most standard Linux distributions include MRTG). What type of data: at the port level, octets in/out and uptime. SNMP can be used to pull other information: WWPN, error counts, firmware level, whether the switch rebooted. For Perl there is the SNMP module, or the NET::SNMP distribution; the binary distribution contains the SNMP library and a Perl MIB module that ties into the binary. The problem with the Perl SNMP module is that it does not load the MIB modules. CRICKET is another way to do this: http://cricket.sourceforge.net/
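The two scheduling options above can be sketched as follows. The paths and the 5-minute interval are examples, not taken from the presentation; adjust them to your install.

```shell
# Option 1: run MRTG from cron every 5 minutes (crontab entry):
#
#   */5 * * * * env LANG=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg >/dev/null 2>&1
#
# Option 2: run MRTG as a daemon by adding to mrtg.cfg:
#
#   RunAsDaemon: Yes
#   Interval: 5
```

Daemon mode avoids the startup cost of relaunching MRTG every interval, at the price of having to manage the long-running process yourself.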
mrtgrrd is a CGI script that will query RRD files; it takes about a second. It is included in the set of contributed files and is easy to set up: set a couple of variables and point it at the MRTG files to read. RRD files are OS-specific, so when moving between OSes you must export them as XML. The standard commands for XML-formatted data in RRDtool are rrdtool dump and rrdtool restore.
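The dump/restore round trip looks like this in practice. The file names are examples, and the commands are guarded so the sketch exits cleanly where rrdtool or a sample file is not present.

```shell
#!/bin/sh
# Sketch: move an RRD between hosts via XML, since binary .rrd files
# are architecture-specific. File names are examples; the commands
# only run if rrdtool and a sample file are actually available.
if command -v rrdtool >/dev/null 2>&1 && [ -f port01.rrd ]; then
  rrdtool dump port01.rrd > port01.xml     # binary RRD -> portable XML
  # copy port01.xml to the target host, then:
  rrdtool restore port01.xml port01.rrd    # XML -> binary RRD
else
  echo "skipped: rrdtool or port01.rrd not available"
fi
```

Note that rrdtool restore will refuse to overwrite an existing .rrd on the target, which is usually what you want when migrating.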
Not generating graphs automatically is a good thing: it saves a great deal of processing time!
This slide provides a high-level description of the key components required for collecting the data. As a side note, the queries used against the TSE DB2 database are very similar to what would be run against a TPC for Disk or MDM DB2 database. It is assumed that a directory structure is already configured. The script to execute the SQL query does two things: it takes a query template and replaces key parameters, such as start and end date and time, with the correct values; it then creates a small shell script that executes the SQL query. The queries provided in the appendices listed above are used to gather array-level configuration and performance data.
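The template step described above can be sketched with sed: substitute the %STARTDATE-style placeholders in a SQL template, producing a query ready to feed to the DB2 command line (e.g. via db2 -tvf). The placeholder names match Appendix F; the file names, dates, and ESS serial are examples.

```shell
#!/bin/sh
# Sketch: fill a SQL template's %PLACEHOLDER parameters and emit a
# runnable query. Dates, serial, and file names are example values.
cat > array_perf.tpl <<'EOF'
SELECT * FROM DB2ADMIN.VPCRK
WHERE PC_DATE_B >= '%STARTDATE' AND PC_DATE_E <= '%ENDDATE'
  AND M_MACH_SN = '%ESSID';
EOF

sed -e "s/%STARTDATE/2005-06-01/" \
    -e "s/%ENDDATE/2005-06-02/" \
    -e "s/%ESSID/22513/" \
    array_perf.tpl > array_perf.sql

cat array_perf.sql
```

In the real collector the dates would come from script arguments or the current time, and the generated .sql would be wrapped in the small shell script that invokes db2.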
Array-level data is used because it provides physical measurements for the arrays. Using the configuration information gathered, it can also provide a summary of ESS performance at various higher-level components, including cluster and disk adapter. This report should include a calculated metric for scoring the health of the arrays. The VPCRK table, from which the array-level data is extracted, also provides some cluster-level information (cache, NVS); these should be saved in the exception table as well.
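One possible shape for the calculated health metric mentioned above: score each array from its average response time and disk utilization and flag high scores as exceptions. The thresholds, weights, and sample numbers here are hypothetical, not taken from the presentation.

```shell
#!/bin/sh
# Sketch: a calculated health score per array from avg response time
# (ms) and disk utilization (%). Thresholds and weights are
# hypothetical; input columns are array, RT ms, util %.
result=$(echo 'A01 4.2 35
A02 18.0 82' | awk '
  {
    score = 0
    if ($2 > 15) score += 2; else if ($2 > 10) score += 1   # response time
    if ($3 > 70) score += 2; else if ($3 > 50) score += 1   # utilization
    printf "%s score=%d %s\n", $1, score, (score >= 3 ? "EXCEPTION" : "ok")
  }')
echo "$result"
```

Rows flagged EXCEPTION would be the ones inserted into the exception table and surfaced in the drill-down charts.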
Exception charts include the array-level and cluster-level exceptions that were created and imported in the previous step from the array-level data. The healthcheck reports should provide a high-level summary of the health of an ESS at both the server level and the component level (cluster, adapter, array), as well as a summary of all ESSs for a given customer. For shared ESSs, the report does not reflect a single customer's perspective. In this step it is necessary to define the data needed to complete the report. The easiest way to do this is to design the report and then map the data in the report to the data in the DB. During this step, or before the next one, you should define the SQL queries required for each report and any business logic required.
Forms should provide a means for the user (you) to select the report type (healthcheck at the server level, or rank report) as well as the required parameters for the SQL statement that will pull the data for the specific report. The parameters include: start date, end date, start time, end time, ESS, and report type.
Generally speaking, the I/O response time is the amount of time from the point where the I/O request hits the device driver until the completed I/O is returned by the device driver.
For IBMers, I have a sample script that I can make available. External customers should contact their local IBM AIX field reps to see if they have anything, or roll their own script.