Capturing Solaris 11.4 (12) analytical data
Note: There is an updated process for Solaris 11.4+; please check out part 5 – Enhanced method on how to capture analytics in Solaris 11.4+.
Note: The full code is available in a GitHub repository. To play with the code, feel free to just clone the Solaris Analytics Publisher repository.
In the last post we prepared and configured the Stats Store (sstore) with our own schema. Below I am going to show you how to capture and prepare the data so that it can be populated into the Stats Store (sstore).
- Please check out part 1 on how to configure analytics.
- Please check out part 2 on how to configure the client stats capture process.
- Please check out part 3 on how to publish the client-captured stats.
- Please check out part 4 on configuring and accessing the web dashboard/UI.
- Please check out part 5 on capturing Solaris 11.4 (12) analytics by using the Remote Administration Daemon (RAD).
Configuring the clients to capture data
There are two parts to the client configuration:
- The client capturing stats, i.e. CPU, memory, DB, etc.
- The client making these stats available (remotely)
First, I will show you how to collect OS stats.
To capture OS stats I use a Python module called psutil; you can use any other form of data collection, as outlined below. Tip: psutil is included in Solaris 11.4 (12) but not in Solaris 11.
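For example, the handful of psutil calls the capture script below relies on can be tried interactively first. A minimal sketch (the device names printed will differ from system to system):

#!/usr/bin/env python
# Quick look at the psutil counters used by the capture script below
import psutil as ps

print ps.cpu_times_percent()                      # per-state CPU percentages
print ps.virtual_memory()                         # total/used/free memory
print ps.swap_memory()                            # swap usage
print ps.disk_io_counters(perdisk=True).keys()    # disk names, e.g. sd1, sd2
print ps.net_io_counters(pernic=True).keys()      # NIC names, e.g. net0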
Let's install psutil (or update to the latest version on Solaris 11.4/12)
For Solaris 12, you can install psutil from the repo; just run the below.

pkg install pkg://solaris/library/python/psutil

Note: The below steps are for Solaris 11. If you don't have Internet access, you can download the psutil-4.x.tar.gz file from the Python modules website.
Installing the Python module
A prerequisite is required for the module to install properly: the system header package. Install it by running the below.

pkg install system/header

Note: On Solaris 11 you will also need cc; download it from Oracle (Oracle/Sun Studio).
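You can verify the package is in place before building:

pkg list system/header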
Installing psutil
# For Python 2.6 (system default)
export PATH=/mnt/usr/CC12_3/solarisstudio12.3/bin:$PATH
pip install psutil-4.3.0.tar.gz

# For Python 2.7 (system default)
mkdir -p /ws/on11update-tools/SUNWspro
ln -s /mnt/usr/CC12_3/solarisstudio12.3 /ws/on11update-tools/SUNWspro/sunstudio12.1
pip-2.7 install psutil-4.3.0.tar.gz

umount /mnt
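Once installed, a quick sanity check that the module imports cleanly (use the interpreter you installed into, i.e. python or python2.7):

python -c "import psutil; print psutil.__version__"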
Next, let's create the client capture program.
Create the directories
mkdir -p /opt/sys_monitor/conf /opt/sys_monitor/bin /opt/sys_monitor/db \
    /opt/sys_monitor/services /opt/sys_monitor/startup /opt/sys_monitor/modules
Each directory's contents are explained below:
* /opt/sys_monitor/conf – contains DB-related configuration scripts
* /opt/sys_monitor/bin – Python code to capture and expose stats
* /opt/sys_monitor/db – contains the local SQLite DB with the latest stat record
* /opt/sys_monitor/services – contains the Solaris service XML files
* /opt/sys_monitor/startup – SMF startup helper scripts
* /opt/sys_monitor/modules – psutil module (only needed for install)
* /opt/sys_monitor/statsSrc – contains the Stats Store JSON files

The DB-related scripts are below.
chk_db1_apps-ses.sh – get DB sessions
The below script returns the number of active system sessions in MySQL, which will then be used to populate the Stats Store (sstore) with the result.

cat /opt/sys_monitor/conf/chk_db1_apps-ses.sh

#!/usr/bin/env bash
### Export environment variable settings
. /opt/sys_monitor/conf/set_env
mysql -u root -ppassword -t <
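The heredoc that feeds mysql was truncated above (only the < survived). A minimal sketch of the idea, where the session-count SQL is an assumption; substitute whatever query matches how you count sessions:

#!/usr/bin/env bash
### Export environment variable settings
. /opt/sys_monitor/conf/set_env
# Assumption: count active sessions from the MySQL processlist
mysql -u root -ppassword -t <<'EOF'
SELECT COUNT(*) FROM information_schema.PROCESSLIST;
EOF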
test_db1_apps.sh – checks DB query timing

The below script runs a particular query that takes some time to return (you can replace it with your own query). We will then populate the Stats Store with the time it took to complete the query.

cat /opt/sys_monitor/conf/test_db1_apps.sh

#!/usr/bin/env bash
#set -x
### Export environment variable settings
. /opt/sys_monitor/conf/set_env
mysql -u root -ppassword -t <
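This heredoc was truncated as well; any query that is representative of your workload goes here. A hypothetical placeholder (the table name is made up):

#!/usr/bin/env bash
#set -x
### Export environment variable settings
. /opt/sys_monitor/conf/set_env
# Assumption: replace with a long-running query of your own
mysql -u root -ppassword -t <<'EOF'
SELECT COUNT(*) FROM your_db.your_big_table;
EOF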
Python capture scripts

Stats Python capture script
The below Python capture script captures OS data as well as DB data (using the two scripts above).

cat /opt/sys_monitor/bin/capture.py

#!/usr/bin/env python
import os
import sys
import time
import string
import sqlite3
import subprocess
from multiprocessing import Process
import psutil as ps

db1_host = "localhost"

# SQLite databases and the schema files used to create them on first run
db_files = {"monitor": {
                "db_filename": "/opt/sys_monitor/db/monitor.db",
                "schema_filename": "/opt/sys_monitor/db/sql_schema.sql"},
            "db1_qry": {
                "db_filename": "/opt/sys_monitor/db/db1_qry-monitor.db",
                "schema_filename": "/opt/sys_monitor/db/db1_qry-sql_schema.sql"}
            }

for x in db_files:
    db_file = db_files[x]["db_filename"]
    schema_file = db_files[x]["schema_filename"]
    db_is_new = not os.path.exists(db_file)
    with sqlite3.connect(db_file) as conn:
        if db_is_new:
            print 'Creating schema'
            with open(schema_file, 'rt') as f:
                schema = f.read()
            conn.executescript(schema)
        else:
            print 'Database ', db_file, 'exists, assume schema ', schema_file, 'does, too.'

# sleep at first to start stats
time.sleep(1)

def qryTime():
    # Time how long the test query takes, then store the duration
    start_time = int(time.time())
    subprocess.call(['/opt/sys_monitor/conf/test_db1_apps.sh', db1_host],
                    stdout=subprocess.PIPE, shell=False, stderr=subprocess.PIPE)
    time.sleep(5)
    end_time = int(time.time())
    date_time = end_time
    db1_qry = end_time - start_time
    if db1_qry < 3:
        # back off for a minute when the query returns quickly
        time.sleep(60)
    rowid = 1
    conn = sqlite3.connect('/opt/sys_monitor/db/db1_qry-monitor.db')
    t = [rowid, date_time, db1_qry]
    # rowid is always 1, so the table only ever holds the latest sample
    conn.execute('INSERT OR REPLACE INTO db1Qry values (?,?,?)', t)
    conn.commit()

def statTime():
    # First sample of the cumulative disk/network counters
    disks1 = ps.disk_io_counters(perdisk=True)
    dsk1_0b = disks1["sd1"]
    dsk1_0c = disks1["sd2"]
    net1 = ps.net_io_counters(pernic=True)
    net1_all = net1["net0"]
    time.sleep(2)
    date_time = int(time.time())
    cpu = ps.cpu_times_percent()
    mem = ps.virtual_memory()
    swap = ps.swap_memory()
    # Second sample; the deltas below give per-interval values
    disks2 = ps.disk_io_counters(perdisk=True)
    net2 = ps.net_io_counters(pernic=True)
    cpu_usr = int(round(cpu[0], 3))
    cpu_sys = int(round(cpu[1], 3))
    cpu_tot = int(round(cpu[0] + cpu[1], 3))
    # Conversion below - (0, 'B'), (10, 'KB'), (20, 'MB'), (30, 'GB'), (40, 'TB'), (50, 'PB')
    mem_usd = int(round(mem[3] / 2 ** 20))
    mem_tot = int(round(mem[0] / 2 ** 20))
    swp_usd = int(round(swap[1] / 2 ** 20))
    swp_tot = int(round(swap[0] / 2 ** 20))
    dsk2_0b = disks2["sd1"]
    dsk2_0c = disks2["sd2"]
    dsk_0b_rop = (dsk2_0b[0] - dsk1_0b[0])
    dsk_0b_wop = (dsk2_0b[1] - dsk1_0b[1])
    dsk_0b_rmb = (dsk2_0b[2] - dsk1_0b[2]) / 1024 / 1024
    dsk_0b_wmb = (dsk2_0b[3] - dsk1_0b[3]) / 1024 / 1024
    dsk_0b_rtm = (dsk2_0b[4] - dsk1_0b[4])
    dsk_0b_wtm = (dsk2_0b[5] - dsk1_0b[5])
    dsk_0c_rop = (dsk2_0c[0] - dsk1_0c[0])
    dsk_0c_wop = (dsk2_0c[1] - dsk1_0c[1])
    dsk_0c_rmb = (dsk2_0c[2] - dsk1_0c[2]) / 1024 / 1024
    dsk_0c_wmb = (dsk2_0c[3] - dsk1_0c[3]) / 1024 / 1024
    dsk_0c_rtm = (dsk2_0c[4] - dsk1_0c[4])
    dsk_0c_wtm = (dsk2_0c[5] - dsk1_0c[5])
    net2_all = net2["net0"]   # same NIC as the first sample (the original read net1 here, a typo)
    net_smb = (net2_all[0] - net1_all[0]) / 1024 / 1024 / 2
    net_rmb = (net2_all[1] - net1_all[1]) / 1024 / 1024 / 2
    ses_c = subprocess.Popen(['/opt/sys_monitor/conf/chk_db1_apps-ses.sh', db1_host],
                             stdout=subprocess.PIPE, shell=False, stderr=subprocess.PIPE)
    stdout = ses_c.communicate()[0]
    db1_ses = filter(type(stdout).isdigit, stdout)   # keep only the digits
    rowid = 1
    conn = sqlite3.connect('/opt/sys_monitor/db/monitor.db')
    t = [rowid, date_time, cpu_usr, cpu_sys, cpu_tot, mem_usd, mem_tot,
         swp_usd, swp_tot, dsk_0b_rop, dsk_0b_wop, dsk_0b_rmb, dsk_0b_wmb,
         dsk_0b_rtm, dsk_0b_wtm, dsk_0c_rop, dsk_0c_wop, dsk_0c_rmb,
         dsk_0c_wmb, dsk_0c_rtm, dsk_0c_wtm, net_smb, net_rmb, db1_ses]
    conn.execute('INSERT OR REPLACE INTO monitor values (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)', t)
    conn.commit()

def chkDb():
    while True:
        qryTime()

def chkStats():
    while True:
        statTime()

if __name__ == '__main__':
    # Run the DB check and the OS stats capture in parallel
    p1 = Process(target=chkDb)
    p1.start()
    p2 = Process(target=chkStats)
    p2.start()

Note: The above script captures all stats in parallel, i.e. one stat taking longer to complete does not hold up the others.
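Before putting the capture script under SMF control, you can run it by hand and confirm rows are landing in SQLite (assuming the two DB scripts above are in place and the sqlite3 command-line tool is installed):

cd /opt/sys_monitor/bin
./capture.py &
sleep 10
# rowid is always 1, so a single row holds the most recent sample
sqlite3 /opt/sys_monitor/db/monitor.db 'SELECT * FROM monitor;'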
Stats publish script (listen socket), for remote use

The below script fetches the latest results from the SQLite DBs, which are continuously updated by the capture script above (the SQLite tables themselves are created by the capture script). The fetch script is below.

cat /opt/sys_monitor/bin/get_results.py

#!/usr/bin/env python
import os
import sys
import time
import json
import socket
import sqlite3
import threading
import psutil as ps

host = (socket.gethostname())
port = 19099
backlog = 10
size = 1024

def getDBData():
    # Read the single latest row from each SQLite DB and build the JSON payload
    rowid = 1
    db1qry_file = '/opt/sys_monitor/db/db1_qry-monitor.db'
    db = sqlite3.connect(db1qry_file)
    db.row_factory = sqlite3.Row
    conn = db.cursor()
    conn.execute('''SELECT * FROM db1Qry WHERE rowid=1''')
    for row in conn:
        db1_qry = row['db1_qry']
        lst_qry = row['date_time']
    mon_file = '/opt/sys_monitor/db/monitor.db'
    db = sqlite3.connect(mon_file)
    db.row_factory = sqlite3.Row
    conn = db.cursor()
    conn.execute('''SELECT * FROM monitor WHERE rowid=1''')
    for row in conn:
        data = {
            "date": {"date_time": row['date_time']},
            "cpu": {"cpu_usr": row['cpu_usr'], "cpu_sys": row['cpu_sys'],
                    "cpu_tot": row['cpu_tot']},
            "memory": {"mem_usd": row['mem_usd'], "mem_tot": row['mem_tot']},
            "swap": {"swp_usd": row['swp_usd'], "swp_tot": row['swp_tot']},
            "disk": {"dsk_0b_rop": row['dsk_0b_rop'], "dsk_0b_wop": row['dsk_0b_wop'],
                     "dsk_0b_rmb": row['dsk_0b_rmb'], "dsk_0b_wmb": row['dsk_0b_wmb'],
                     "dsk_0b_rtm": row['dsk_0b_rtm'], "dsk_0b_wtm": row['dsk_0b_wtm'],
                     "dsk_0c_rop": row['dsk_0c_rop'], "dsk_0c_wop": row['dsk_0c_wop'],
                     "dsk_0c_rmb": row['dsk_0c_rmb'], "dsk_0c_wmb": row['dsk_0c_wmb'],
                     "dsk_0c_rtm": row['dsk_0c_rtm'], "dsk_0c_wtm": row['dsk_0c_wtm']},
            "network": {"net_smb": row['net_smb'], "net_rmb": row['net_rmb']},
            "db1": {"lst_qry": lst_qry, "db1_qry": db1_qry, "db1_ses": row['db1_ses']}
        }
    db.close()
    return json.dumps(data, indent=4, sort_keys=True)

class ThreadedServer(object):
    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.sock.bind((self.host, self.port))

    def listen(self):
        self.sock.listen(backlog)
        while True:
            # One thread per client connection
            client, address = self.sock.accept()
            client.settimeout(15)
            threading.Thread(target=self.listenToClient,
                             args=(client, address)).start()

    def listenToClient(self, client, address):
        while True:
            try:
                # Any request (get_stat or otherwise) gets the full JSON payload back
                data = client.recv(1024)
                new_data = getDBData()
                client.send(new_data)
                client.close()
                return False
            except:
                client.close()

if __name__ == "__main__":
    ThreadedServer(host, port).listen()
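Since the listener answers every request with the full JSON payload, any TCP client can pull the stats remotely. A minimal Python client sketch (the hostname db1 matches the example output further below):

#!/usr/bin/env python
# Minimal test client for the stats listener above
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('db1', 19099))   # host and port as configured in get_results.py
s.send('get_stat')          # any payload works; the server always replies with the stats
print s.recv(4096)
s.close()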
Local SQLite DB schema files

Note: The DBs will be created automatically at run time from these schema files.

DB1 schema file
cat /opt/sys_monitor/db/db1_qry-sql_schema.sql

create table db1Qry (
    rowid integer primary key,
    date_time integer,
    db1_qry integer
);

General monitor schema file
cat /opt/sys_monitor/db/sql_schema.sql

create table monitor (
    rowid integer primary key,
    date_time integer,
    cpu_usr integer,
    cpu_sys integer,
    cpu_tot integer,
    mem_usd integer,
    mem_tot integer,
    swp_usd integer,
    swp_tot integer,
    dsk_0b_rop integer,
    dsk_0b_wop integer,
    dsk_0b_rmb integer,
    dsk_0b_wmb integer,
    dsk_0b_rtm integer,
    dsk_0b_wtm integer,
    dsk_0c_rop integer,
    dsk_0c_wop integer,
    dsk_0c_rmb integer,
    dsk_0c_wmb integer,
    dsk_0c_rtm integer,
    dsk_0c_wtm integer,
    net_smb integer,
    net_rmb integer,
    db1_ses integer
);

Create SMF startup helper scripts
Capture startup script
cat /opt/sys_monitor/startup/capture_startup.sh

#!/bin/bash
case $1 in
    start)
        cd /opt/sys_monitor/bin
        nohup ./capture.py &
        ;;
    stop)
        kill -9 `ps -ef | grep capture.py | grep -v grep | awk '{print $2}'`
        exit 0
        ;;
    *)
        echo "Usage: $0 [start|stop]"
        ;;
esac

Publish results startup script
cat /opt/sys_monitor/startup/get_results.sh

#!/bin/bash
case $1 in
    start)
        cd /opt/sys_monitor/bin
        nohup ./get_results.py &
        ;;
    stop)
        kill -9 `ps -ef | grep get_results.py | grep -v grep | awk '{print $2}'`
        exit 0
        ;;
    *)
        echo "Usage: $0 [start|stop]"
        ;;
esac

Create SMF services
Capture service
cat /opt/sys_monitor/services/capture_service.xml
Stats Capture

Publish results service
cat /opt/sys_monitor/services/getresults_service.xml
Stats Result
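The manifest bodies did not survive in the post (only the service descriptions "Stats Capture" and "Stats Result" remain), so here is a minimal sketch of what the capture manifest might look like. The service name matches the svcadm commands below, but the rest is an assumption; the getresults manifest is analogous, with stats_result and get_results.sh:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<!-- Hypothetical manifest sketch; the original XML was not preserved -->
<service_bundle type='manifest' name='stats_capture'>
  <service name='application/stats_capture' type='service' version='1'>
    <create_default_instance enabled='false'/>
    <single_instance/>
    <exec_method type='method' name='start'
        exec='/opt/sys_monitor/startup/capture_startup.sh start'
        timeout_seconds='60'/>
    <exec_method type='method' name='stop'
        exec='/opt/sys_monitor/startup/capture_startup.sh stop'
        timeout_seconds='60'/>
    <!-- transient: SMF runs the start method once and does not track the process -->
    <property_group name='startd' type='framework'>
      <propval name='duration' type='astring' value='transient'/>
    </property_group>
    <template>
      <common_name><loctext xml:lang='C'>Stats Capture</loctext></common_name>
    </template>
  </service>
</service_bundle>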
Now we are ready to start capturing. Let's import the services:
svccfg import /opt/sys_monitor/services/capture_service.xml
svccfg import /opt/sys_monitor/services/getresults_service.xml
# start the services
svcadm enable svc:/application/stats_capture:default svc:/application/stats_result:default

Verify all is working
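First, confirm that both services came online:

svcs stats_capture stats_result

Both should report STATE online; if one lands in maintenance, svcs -x points at the relevant log file.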
An example of the output is below.
telnet db1 19099
# or
curl http://db1:19099

Trying 10.10.10.150...
Connected to db1
Escape character is '^]'.
{
    "cpu": {
        "cpu_sys": 2,
        "cpu_tot": 13,
        "cpu_usr": 11
    },
    "date": {
        "date_time": 1472679323
    },
    "disk": {
        "dsk_0b_rmb": 10,
        "dsk_0b_rop": 1327,
        "dsk_0b_rtm": 326,
        "dsk_0b_wmb": 0,
        "dsk_0b_wop": 134,
        "dsk_0b_wtm": 4,
        "dsk_0c_rmb": 11,
        "dsk_0c_rop": 1432,
        "dsk_0c_rtm": 334,
        "dsk_0c_wmb": 0,
        "dsk_0c_wop": 134,
        "dsk_0c_wtm": 4
    },
    "db1": {
        "db1_qry": 24,
        "db1_ses": 2107,
        "lst_qry": 1472679318
    },
    "memory": {
        "mem_tot": 261120,
        "mem_usd": 205772
    },
    "network": {
        "net_rmb": 2,
        "net_smb": 3
    },
    "swap": {
        "swp_tot": 92159,
        "swp_usd": 0
    }
}

Note: In many cases you will need to update the settings below in the capture script /opt/sys_monitor/bin/capture.py. Your disk devices can be seen with iostat -xc, and your network device with ipadm or dladm.

db1_host
dsk1_0b = disks1["ssd428"]
dsk1_0c = disks1["ssd428"]
net1_all = net1["db1"]
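To find the right device names to plug in, run the commands the note mentions, for example:

iostat -xc 1 1      # extended disk stats; device names (e.g. sd1, ssd428) in the first column
dladm show-link     # datalink (NIC) names, e.g. net0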
Next, in part 3, I will show you how to publish the stats in analytics. Click here to go to part 3. You might also like: articles related to Oracle Solaris 11.4/Solaris 12. Like what you're reading? Please provide feedback; any feedback is appreciated.