Zenoss's Frequently Asked Questions page is a central hub where customers can go with their most common questions. These are the 581 most popular questions Zenoss receives.
Applies To
Zenoss 5.x
Zenoss 6.x
Summary
When it becomes necessary to perform maintenance on Zenoss Resource Manager (RM), it is useful to perform controlled, incremental shutdowns and startups for Control Center (CC) and Resource Manager. This KB discusses the best practices for performing these tasks.
Procedures
Controlled Shutdown Procedure
Stop RM and top-level applications.
The RM application should be stopped before stopping Control Center. This allows for a controlled startup of RM services after Control Center is restarted.
Use the following command to stop RM:
# serviced service stop Zenoss.resmgr
Wait for all of the applications to stop. To determine if the applications are stopped, watch the status of services until all have a state of Stopped. The following command lists every service that is NOT stopped. It is safe to ignore services that have no status - these are the folder services that do not run anything:
# serviced service status | grep -v Stopped
Stop the CC master:
# systemctl stop serviced
Stop all resource pool workers (formerly known as CC agents).
On each resource pool worker node, stop serviced:
# systemctl stop serviced
Remove stray docker containers.
NOTE: Perform this step on both the master and the resource pool worker nodes.
Although stopping serviced is normally enough to stop all docker containers on the node, sometimes stray containers remain. Because stray containers can prevent NFS from unmounting in the next step, ensure all containers are stopped and no strays remain.
Determine if there are remaining containers, including those in an Exited status:
# docker ps -a
If any containers remain, use the following command to remove them:
# docker ps -qa | xargs --no-run-if-empty docker rm -fv
rm Command Hangs
In some edge cases, that last command can hang because a container will not die, most frequently because of an NFS hang. To resolve the issue and stop the container, perform the following:
Stop NFS:
# systemctl stop nfs
Stop docker:
# systemctl stop docker
Start NFS:
# systemctl start nfs
Start docker:
# systemctl start docker
Kill remaining container(s):
# docker ps -qa | xargs --no-run-if-empty docker rm -fv
NOTE: If a container continues to persist, a last resort is to reboot the resource pool worker node.
Unmount all resource pool NFS mount points.
NOTE: This step applies to the resource pool worker nodes.
When serviced and the docker containers are stopped, unmount the NFS mount points. This prevents stale NFS mount errors on the resource pool worker nodes in cases where the master's storage has to be completely replaced.
Check for active mounts:
# grep serviced /proc/mounts
A typical agent-side mount has the format
<ccMasterIP>:/serviced_volumes_v2/<tenantID> /opt/serviced/var/volumes/<tenantID>
For example:
1.8.1.4:/serviced_volumes_v2/erjcdpennn9yqfcg82hfso6q7 /opt/serviced/var/volumes/erjcdpennn9yqfcg82hfso6q7
If there are no serviced mounts, umounting is complete. If mounts exist for serviced, they must be removed. Proceed to using the force option in step b.
Note: If the master has multiple tenant applications, there can be multiple mounts.
Use the force option to unmount the volumes:
# umount -f /opt/serviced/var/volumes/<tenantID>
Confirm the mount point is removed. Consult /proc/mounts. Note that in some edge cases, a simple umount does not work and the workaround is to do a lazy unmount and restart NFS:
# umount -f -l /opt/serviced/var/volumes/<tenantID>
# systemctl restart nfs
If the umount still fails, reboot the resource pool worker node.
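The mount check in this step can also be scripted. The following is a minimal Python sketch, not part of the Zenoss tooling, that lists mount points under /opt/serviced/var/volumes/ from text in /proc/mounts format. The function name serviced_mount_points and the filesystem type and options in the sample line are assumptions for illustration.

```python
def serviced_mount_points(mounts_text, prefix="/opt/serviced/var/volumes/"):
    """Return mount points under `prefix` found in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 2 and fields[1].startswith(prefix):
            points.append(fields[1])
    return points


# Hypothetical sample; on a worker node, read the real file instead:
#   with open("/proc/mounts") as f: mounts_text = f.read()
sample = (
    "1.8.1.4:/serviced_volumes_v2/erjcdpennn9yqfcg82hfso6q7 "
    "/opt/serviced/var/volumes/erjcdpennn9yqfcg82hfso6q7 nfs4 rw 0 0\n"
    "proc /proc proc rw 0 0\n"
)
remaining = serviced_mount_points(sample)
print(remaining)
```

An empty list means unmounting is complete; any path returned is a candidate for the umount commands shown in this step.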
Controlled Startup Procedure
This procedure helps avoid a chaotic startup in a large production environment and enables rapid isolation and resolution of any problems that appear during startup or restart. There are many possible variations in which RM services are started and in what order. The key point of this procedure is to defer starting the collection services until all other RM services required to support the collectors are up and running.
Start CC Master
Determine if the CC deployment is using a Zookeeper ensemble. To determine if an ensemble is configured, inspect the /etc/default/serviced file and look for the definition:
SERVICED_ISVCS_ZOOKEEPER_QUORUM
If the definition SERVICED_ISVCS_ZOOKEEPER_QUORUM is not set, the deployment is not using a Zookeeper ensemble and the CC master can be started.
If the definition SERVICED_ISVCS_ZOOKEEPER_QUORUM is set, the deployment is using a Zookeeper ensemble and the CC master cannot be started without the other nodes that make up the ensemble. Before starting the CC master, Control Center must be started on the resource pool worker nodes defined by that quorum string.
Start the CC master:
# systemctl start serviced
Monitor the CC log file to verify successful startup:
# journalctl -u serviced -o cat -f
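The ensemble check described in this procedure can be scripted. The sketch below is a hedged Python example that treats /etc/default/serviced as simple KEY=value lines and assumes that lines starting with # are commented-out defaults. The function name zookeeper_quorum and the sample value are hypothetical.

```python
def zookeeper_quorum(config_text, name="SERVICED_ISVCS_ZOOKEEPER_QUORUM"):
    """Return the value of `name` if set on an uncommented line, else None."""
    value = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out defaults
        key, sep, val = line.partition("=")
        if sep and key.strip() == name and val.strip():
            value = val.strip()  # the last assignment wins
    return value


# Hypothetical file contents; on the master, read /etc/default/serviced instead.
sample = (
    "# SERVICED_ISVCS_ZOOKEEPER_QUORUM=commented-default\n"
    "SERVICED_ISVCS_ZOOKEEPER_QUORUM=example-quorum-string\n"
)
quorum = zookeeper_quorum(sample)
print("ensemble configured" if quorum else "no ensemble configured")
```

A return value of None corresponds to the case where the CC master can be started on its own.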
Restart Worker Nodes and RM
For the following steps, verify that RM services for each step are started and show clean health checks before proceeding to the next step. If the services do not start, or do not pass their health checks, stop and resolve the underlying issue before continuing.
Restart Resource Pool Worker Nodes
Log in to the resource pool worker nodes and restart serviced.
Note: This step is not required for any resource pool worker node started in the previous step to enable the Zookeeper ensemble.
Monitor the Hosts page in the CC UI to verify all hosts are up and running.
Review IP Assignments
In cases where the system has been restarted following a restore from backup, IP assignments may need to be defined. To review the assignments:
Navigate to the Zenoss.resmgr page under Applications in the CC UI.
Review the IP Assignments table.
If all of the services already have an IP assignment, continue to the next step.
If any service does not have an IP assignment, add an IP assignment for that service. If Automatic assignment does not add the appropriate IP assignments, use a manual IP assignment.
Start Services & Metrics
Start the services under HBase. HBase is the first set of services to start because OpenTSDB and the rest of the performance metric pipeline depend on the HBase services.
Start the remaining services under Infrastructure.
Start the services under Events.
Start the services under User Interface.
Start the zproxy service. ZProxy is the topmost service (this is named Zenoss.resmgr in most installations). Clicking the Start button for the Zenoss.resmgr service in the UI provides the choice of starting just that single service, or that service and all of its children. Choose to only start the single Zenoss.resmgr service.
Start the services under Metrics.
Start the service(s) named zenhub.
Start the remaining collector services.
Log into RM and spot check data collection and events to verify everything is working properly.
Pre-Requisites
Access to a machine with curl installed that can reach the web interface of the Zenoss server.
Login credentials for a user account with the Manager, ZenManager, or ZenOperator role assigned.
Applies To
Zenoss 5.x
Zenoss 4.x
Summary
The following are examples of creating and closing events using the curl command from a Linux terminal. These examples programmatically create and close events on your Zenoss system.
For these examples:
Login credentials are admin:zenoss
Zenoss Resource Manager system is sandbox411.zenoss.loc
zope/zenwebserver listens on port 8080.
Note: the two examples are formatted differently to demonstrate two ways of escaping characters in bash. Both are correct; use whichever method you prefer.
Create an event:
curl -u "admin:zenoss" -X POST -H "Content-Type:application/json" -d "{\"action\":\"EventsRouter\", \"method\":\"add_event\", \"data\":[{\"summary\":\"test55\", \"device\":\"test-rhel6.zenoss.loc\", \"component\":\"\", \"severity\":\"Critical\", \"evclasskey\":\"\", \"evclass\":\"/App\"}], \"type\":\"rpc\", \"tid\":1}" "sandbox411.zenoss.loc:8080/zport/dmd/evconsole_router"
Close an event:
curl -u "admin:zenoss" -X POST -H "Content-Type:application/json" -d '{"action":"EventsRouter","method":"close","data":[{"evids":["0050568a-2045-b48a-11e2-708c5755a8dd"],"params":"{\"severity\":[5,4,3,2,1,0],\"eventState\":[0,1,2]}","limit":1}],"type":"rpc","tid":1}' "sandbox411.zenoss.loc:8080/zport/dmd/evconsole_router"
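Hand-escaping JSON in bash is error-prone, so the request bodies can instead be generated. The following Python sketch builds the same two payloads as the curl examples using the standard json module. The function names add_event_payload and close_event_payload are hypothetical, and the field values are copied from the examples above.

```python
import json


def add_event_payload(summary, device, severity="Critical", evclass="/App"):
    """Build the request body used by the create-event curl example."""
    return json.dumps({
        "action": "EventsRouter",
        "method": "add_event",
        "data": [{
            "summary": summary,
            "device": device,
            "component": "",
            "severity": severity,
            "evclasskey": "",
            "evclass": evclass,
        }],
        "type": "rpc",
        "tid": 1,
    })


def close_event_payload(evids):
    """Build the request body used by the close-event curl example."""
    # "params" is itself a JSON string embedded in the outer JSON document.
    params = json.dumps({"severity": [5, 4, 3, 2, 1, 0], "eventState": [0, 1, 2]})
    return json.dumps({
        "action": "EventsRouter",
        "method": "close",
        "data": [{"evids": list(evids), "params": params, "limit": 1}],
        "type": "rpc",
        "tid": 1,
    })


create_body = add_event_payload("test55", "test-rhel6.zenoss.loc")
close_body = close_event_payload(["0050568a-2045-b48a-11e2-708c5755a8dd"])
print(create_body)
print(close_body)
```

Either body can be written to a file and passed to curl with -d @filename, which avoids shell quoting entirely.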
UUID Returned
The uuid returned in the response from the JSON call to create the event is a uuid of the JSON call and is not directly tied to any uuid generated by the event processing system. As such, the uuid for the event will need to be determined before a curl command can be created to close the event. To identify the event's uuid, create a filter and send another request for events that match your filter.
For more information:
See the JSON API documentation at http://wiki.zenoss.org/Working_with_the_JSON_API
Version 1.0.0 of the Cloud View ZenPack (ZenPacks.zenoss.CloudView) has been released.
This release is compatible with Zenoss 6.3 - 6.4 and Zenoss Cloud, and has the following additional requirements:
ZenPacks.zenoss.ZenPackLib 2.1.0 or higher
This is the first release of the Cloud View ZenPack.
Detailed information on this ZenPack is available from the ZenPack Catalog.
Pre-Requisites
Permissions to edit device templates
Access to zendmd highly recommended
Applies To
Zenoss 4.2.x
Zenoss 4.1.1
Zenoss 4.1.0
Zenoss 4.0.2
Summary
It is often necessary to set up a threshold based on a percentage instead of a raw value. To do this, we need two things:
The current value
Either the minimum or maximum value possible
Because component templates provide us with some of these minimum or maximum values, it is relatively easy to accomplish in some cases. The information needs to be called from wherever it exists in the object database.
Procedure
Component Template Example:
Component templates provide us with a lot of valuable information that we can use in creating thresholds. One of the most common thresholds that support is asked to help create is memory utilization based on percentage. In this example, we're using a device in the /Server/SSH/Linux device class. We have two relevant data points available, each measuring something different:
mem.MemFree
The above data point gives us the amount of free (unused) memory available on the system in bytes.
mem.SwapFree
The above data point gives us the amount of free (unused) swap available on the system in bytes.
We can create a threshold based on memory or swap utilization using the maximum values that the component template gathers during modeling. To create a threshold that generates an event based on utilization of either memory or swap, create a MinMaxThreshold based on whichever data point you want to use, and then set the threshold's minimum or maximum value to an expression that includes the attribute the component template gathered.
For this example, setting the Minimum value of a threshold based on the data point MemFree to be 10% of the total memory looks like this:
here.hw.totalMemory * .1
The above line references here.hw.totalMemory -- an attribute in the object database that was gathered by the component template during modeling. At the beginning, "here" refers to the device being monitored. The "hw" organizer exists in the object database, and "totalMemory" is one of the attributes stored in it. The "totalMemory" attribute is the total installed physical memory of the system in bytes. We want to multiply it by ".1" to get 10% of the maximum memory. That way, if the memory available drops below 10% of total, we get an event.
We can do the same for swap:
here.os.totalSwap * .1
Note that the above line references a different organizer, os, from which the value totalSwap is pulled.
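The arithmetic behind these expressions can be checked outside Zenoss. The sketch below is plain Python and does not touch the object database. The function names and the 8 GiB total are hypothetical stand-ins for here.hw.totalMemory.

```python
def percent_of_total(total_bytes, fraction=0.1):
    """Mirror the threshold expression `here.hw.totalMemory * .1`."""
    return total_bytes * fraction


def below_threshold(free_bytes, total_bytes, fraction=0.1):
    """True when free memory is under the minimum, i.e. an event would fire."""
    return free_bytes < percent_of_total(total_bytes, fraction)


total = 8 * 1024 ** 3  # hypothetical device with 8 GiB of installed memory
minimum = percent_of_total(total)
print(minimum)
print(below_threshold(500 * 1024 ** 2, total))  # 500 MiB free
print(below_threshold(2 * 1024 ** 3, total))    # 2 GiB free
```

The same calculation applies to swap by substituting the value of here.os.totalSwap for the total.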
Summary
This document clarifies when the End of Maintenance (EOM) occurs for a particular release.
Details
The End of Maintenance date is stated in the Description of Software, Support, and Services (DOSSS) as the following:
Zenoss is not required to provide maintenance for a version of the software after 12 months following the release of the subsequent version of the software.
For each major version, maintenance releases are only provided for the most recently released minor version.
Based on the release dates of our solutions, this breaks down to the following:
Release        Generally Available   End of Maintenance
Zenoss 2.4     2009-05-04            2010-10-26
Zenoss 2.5     2009-10-26            2011-07-15
Zenoss 3.0     2010-07-15            2012-02-14
Zenoss 3.1     2011-02-14            2012-09-01
Zenoss 3.2     2011-09-01            2012-10-27
Zenoss 4.0     2011-10-27            2012-11-14
Zenoss 4.1.1   2011-11-14            2013-12-03
Zenoss 4.2.2   2012-12-03            2014-03-05
Zenoss 4.2.3   2013-03-05            2014-09-30
Zenoss 4.2.4   2013-09-30            2018-06-30
Zenoss 4.2.5   2014-05-28            2018-06-30
Zenoss 5.0     2015-02-15            2016-06-30
Zenoss 5.1     2016-03-07            2017-12-31
Zenoss 5.2     2016-11-30            2018-03-31
Zenoss 5.3     2017-08-17            2018-11-08
Zenoss 6.0     2017-11-08            2018-03-31
Zenoss 6.1     2018-01-09            2018-06-30
Zenoss 6.2     2018-06-13            2019-01-31
Zenoss 6.3     2019-01-05            2019-08-23
Zenoss 6.4     2019-08-23            Current Release
Maintenance for all ZenPacks supported with a specific release is tied to the specific release EOM date. This includes Service Impact and Analytics.
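The 12-month rule quoted from the DOSSS can be illustrated with a short calculation. The Python sketch below adds 12 months to the general availability date of the subsequent release. The function name end_of_maintenance is hypothetical. The rule reproduces the earlier entries in the list above, such as Zenoss 2.4, while later entries such as Zenoss 4.2.4 list dates that do not follow this simple calculation, so the published dates remain authoritative.

```python
from datetime import date


def end_of_maintenance(next_release_ga):
    """Return the date 12 months after the subsequent release's GA date."""
    try:
        return next_release_ga.replace(year=next_release_ga.year + 1)
    except ValueError:
        # Feb 29 has no counterpart in the following year; fall back to Feb 28.
        return next_release_ga.replace(year=next_release_ga.year + 1, day=28)


# Zenoss 2.5 became generally available on 2009-10-26,
# which gives the End of Maintenance date listed for Zenoss 2.4.
print(end_of_maintenance(date(2009, 10, 26)))
```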
FAQ
Q: What does "End of Maintenance" mean to me as a user?
A: When a product reaches "End of Maintenance", no more patches will be released for that version. End of Maintenance releases will continue to operate, and support will be provided for the release.
Version 1.3.3 of the Cisco UCS Central ZenPack (ZenPacks.zenoss.CiscoUCSCentral) has been released.
This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements:
CiscoUCS ZenPack
PythonCollector ZenPack
The following changes have been made since the previous release: 1.3.2
Fix import errors in CiscoUCSCentral due to removed dependencies in CiscoUCS 3.0.2 (ZPS-6427)
This version has been tested with Zenoss Cloud and Service Impact 5.5. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 2.1.0 of the Microsoft Azure ZenPack (ZenPacks.zenoss.Microsoft.Azure) has been released.
This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements:
ZenPacks.zenoss.PythonCollector >= 1.2
ZenPacks.zenoss.ZenPackLib >= 2.1.1
The following changes have been made since the previous release (2.0.0):
Add new Application Insights and Azure Function components
Fix events for Service Plan components in case all resources are stopped (ZPS-5652)
Fix for Random Monitoring of Instance failed or GET Azure metrics unsuccessful failures events (ZPS-6069)
Fix potential KeyError traceback when modeling (ZPS-6211)
Ensure complete component model after every modeling attempt (ZPS-6258)
Update migration scripts to avoid UI flares for some specific Zenoss configurations (ZPS-5737)
This release has been tested with Zenoss 6.3.x, Zenoss 6.4.x, Zenoss Cloud and Service Impact 5.5.1.
Detailed information on this ZenPack is available from the ZenPack Catalog.
Applies To
Zenoss 4.x
Zenoss 3.x
Zenoss 2.x
Summary
Modeling refers to discovering information on a device. Modeling information is gathered by the zenmodeler daemon. This information can include basic configuration such as hardware model, OS information, serial numbers, and software inventory. It can also include components such as network interfaces and filesystems. After a device is modeled, performance monitoring daemons can use the modeling information to determine what to monitor (such as which filesystems to monitor or exclude from monitoring) and how to monitor it (for example, using the SNMP index). Note: The ZenVMware ZenPack is a special case that uses the zenvmwaremodeler daemon, and is not discussed here.
The zenmodeler daemon chooses the modeler plugins (called collector plugins in the GUI prior to version 3.x) according to the zCollectorPlugins Z-property. The daemon then runs the plugins against the device to gather the modeling information. The modeling information is then passed to zenhub to make changes to the ZODB database. By default, modeling runs every 12 hours, starting from the time when the zenmodeler daemon starts. Because modeling can be processor, disk, and resource intensive for both the devices and the Zenoss server, manually scheduling the modeling process can be helpful in managing resources.
Procedures
The following procedures describe how to configure the modeling frequency for various Zenoss versions. Follow the instructions for your Zenoss version.
For Zenoss version 4.x
Perform the following to configure the modeling frequency using crontab:
Stop the zenmodeler daemon on all collectors:
From the command line, run the following command as the zenoss user:
$ zenmodeler stop
From the Zenoss UI:
Navigate to Advanced > Daemons on the left hand navigation bar.
Click the Stop button for the zenmodeler daemon.
Prevent zenmodeler from running when Zenoss starts up. Remove zenmodeler from the files that control zenmodeler startup:
Edit the $ZENHOME/etc/daemons.txt file. Also edit the $ZENHOME/etc/collectordaemons.txt file, if you are using it on the collectors.
Locate and delete the zenmodeler line.
Save and exit the file(s).
Update the collectors to pick up the change. In the Zenoss UI:
Navigate to Advanced > Collectors.
Select the remote collectors you want to update from the right hand pane.
Select the menu item Update Collector...
Create a cron job for the zenoss user on each collector to schedule zenmodeler. Modify a crontab in /etc to determine when zenmodeler should run. For example, edit /etc/cron.daily/zenoss to add the job:
# Remodel Zenoss devices on this collector.
0 0 * * * bash -lc '/opt/zenoss/bin/*_zenmodeler run --daemon --now'
Note: Use the "*_" string if you have multiple collectors. Each collector has its own command, such as collector1_zenmodeler or collector2_zenmodeler. Remove the "*_" string if you have a single collector on the master.
For Zenoss versions 2.3.x, 3.x
Perform the following to change the modeling frequency using cron:
Stop the zenmodeler daemon.
Stop the currently running zenmodeler daemon on each collector (you can only have one collector).
From the command line, run the following command as the zenoss user:
$ zenmodeler stop
For version 3.x:
Navigate to Advanced > Daemons on the left hand navigation bar.
Click the Stop button for the zenmodeler daemon.
For version 2.3-2.5.2:
Navigate to the Settings > Daemons tab.
Click the Stop button for the zenmodeler daemon.
Stop ZenModeler from being automatically restarted in the future. Remove the zenmodeler line from the $ZENHOME/etc/daemons.txt file so that it does not start automatically with the rest of Zenoss.
In the RM UI, go to Advanced -> Events -> Clear Heartbeats. This will stop you from receiving heartbeat failures for the zenmodeler daemon(s) that are now stopped.
Add a cronjob to schedule when ZenModeler runs.
Add a cronjob for the zenoss user on each collector to schedule when you want the modeler to run. Modify a crontab in /etc to determine when you want it to run. For example, edit /etc/cron.daily/zenoss to add the job:
# Remodel Zenoss devices on this collector.
0 0 * * * bash -lc '/opt/zenoss/bin/*_zenmodeler run --daemon --now'
Note: Use the "*_" string if you have multiple collectors. Each collector has its own command, such as collector1_zenmodeler or collector2_zenmodeler. If you have a single collector on the master, you can remove the "*_" string.
The CiscoUCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been updated to 3.0.2 from 3.0.1.
Dependencies:
Calculated Performance ZenPack
CiscoMonitor ZenPack
Dashboard ZenPack
Dynamic Service View ZenPack
Predictive Threshold ZenPack
PythonCollector ZenPack
ZenPackLib ZenPack
Changes:
Fix duplicates of fan module components. (ZPS-6047)
Fix CIMC session leakage. (ZPS-6110)
Fix "AttributeError: search" tracebacks during graph update. (ZPS-6194)
Fix zenchkrels errors for service profiles. (ZPS-6338)
Fix Authentication failure (Authorization Required 552). (ZPS-5921)
Tested with Zenoss Cloud, Zenoss 6.4.1 and Service Impact 5.5.1.
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 2.1.1 of the EMC (ZenPacks.zenoss.EMC.base) ZenPack has been released.
This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements.
ZenPacks.zenoss.WBEM >= 2.0.1
ZenPacks.zenoss.PythonCollector >= 1.5.2
ZenPacks.zenoss.StorageBase >= 1.3.0
ZenPacks.zenoss.ZenPackLib >= 2.0.4
The following changes have been made since the previous release: 2.1.0
Compatibility with WBEM ZenPack 3.0.0
Updated Storage Pool graphs to reflect collected data and remove deprecated, unused charts (ZPS-3572)
Updated Celerra System modeling to no longer break when processing Virtual Data Movers (ZPS-3844)
Added documentation details for wbemStatsSampleInterval, which is defined on the EMC device (ZPS-3605)
Corrected links for ZC instances which had invalid links from one component to another (ZPS-4400)
Tested with Zenoss 6.3.2, Cloud and Service Impact 5.5
Detailed information on this ZenPack is available in the ZenPack Catalog.
The Dell EMC Isilon ZenPack (ZenPacks.zenoss.EMC.Isilon) has been updated to 1.0.2.
This release is compatible with Zenoss versions 6.3 - 6.4 and Zenoss Cloud.
The following changes have been made since the previous release: 1.0.1
Fix uncatalog errors when deleting node devices (ZPS-6246)
Fix most uncorroborated edges and multiple providers for Impact (ZPS-6214)
Correct zenbatchdump output (ZPS-2900)
IpInterface method override output corrected (ZPS-6182)
Reduce logging verbosity (ZPS-4975)
This release has been tested with Zenoss 6.4.1, Zenoss Cloud with Service Impact 5.5.1
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.4.4 of the StorageBase (ZenPacks.zenoss.StorageBase) ZenPack has been released.
This release is compatible with Zenoss versions 6.3 - 6.4 and Zenoss Cloud and has the following additional requirements.
ZenPacks.zenoss.DynamicView
The following changes have been made since the previous release: 1.4.3
Do not autogenerate Storage reports. (ZPS-4174)
Shows a spinner wheel when generating Storage reports. (ZPS-4304)
This release has been tested with Zenoss 6.4.1, Zenoss Cloud and Service Impact 5.5.1
Detailed information on this ZenPack is available from the ZenPack Catalog.
Applies To
Zenoss 5.1 and greater
Summary
It is important to have accurate and tested system backups because they can mitigate problems caused by software or hardware issues. One method to back up Control Center and its installed applications is to use serviced backup. Backups can be performed via the command line or through automated cron jobs (crontab).
The serviced backup command saves a backup of the current system, including the state of all services and data, to a compressed archive file (.tgz). Because serviced backup leverages functionality provided by serviced snapshot, backups can be performed without shutting down Zenoss or the docker containers; they are only momentarily suspended to enable reading the data.
NOTE: It is important that there is enough free space to receive and store backups because running low on available disk space will result in errors and impact system performance.
About serviced backup
The default directory that receives backups from serviced backup is /opt/serviced/var/backups.
The serviced backup command includes various options to tailor the command for specific use:
USAGE
serviced backup [command options] [arguments...]
DESCRIPTION
serviced backup DIRPATH
OPTIONS
--exclude '--exclude option --exclude option'
Subdirectory of the tenant volume to exclude from backup
--help, -h
Shows the help for an option
Procedures
Backups
Backups can be initiated from the UI or the command line. Zenoss stores its backups by default in the directory /opt/serviced/var/backups.
Best practices for backup include:
Weekly backups for production environments.
Increasing backup frequency to daily during initial deployment or for development environments.
Backup Using Control Center
To back up the system through the Control Center UI, navigate to Control Center > Backup/Restore and click the Create Backup button.
If a backup is run through the UI, the path is controlled by an entry in the /etc/default/serviced file:
# SERVICED_BACKUPS_PATH=/opt/serviced/var/backups
Change this entry to specify a new target directory for the UI initiated backups.
Backup Using the Command Line
To back up your system through the command line:
As root or a valid Control Center user, back up the system to the default directory /opt/serviced/var/backups with the following command:
serviced backup /opt/serviced/var/backups
A successful backup is indicated by the system displaying the new backup file name, for example:
backup-2016-09-29-163201.tgz
Automating Backup with CentOS/RHEL 7
On CentOS/RHEL 7, system backups are not automated with crontab alone. Instead, bash scripts are used to automate cron, fstrim, zenossdbpack, and similar tasks. Zenoss is configured with various bash scripts for this purpose.
For examples of how to create appropriate automation scripts, see the files in /etc/cron.daily/, /etc/cron.hourly/ and /etc/cron.weekly/ in the current installation.
Note: If automation is used to perform backups, care must be exercised to ensure accumulation of old backups does not consume all disk space. Old backups must be curated to maintain available storage space.
Example serviced-backup File
As an example of a script using serviced backup, add a file (called serviced-backup in this example) to the appropriate directory with the following contents:
#!/bin/sh
# Execute a Control Center backup, writing the results to the backups directory
${SERVICED:=/opt/serviced/bin/serviced} backup /opt/serviced/var/backups
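Because automated backups accumulate, some form of pruning is usually paired with a script like the one above. The following Python sketch only selects candidates for removal, based on the timestamp embedded in file names of the form backup-2016-09-29-163201.tgz. The function name backups_older_than and the 14-day retention period are assumptions, not Zenoss defaults.

```python
from datetime import datetime, timedelta


def backups_older_than(filenames, days, now):
    """Return backup file names whose embedded timestamp is older than `days`."""
    cutoff = now - timedelta(days=days)
    old = []
    for name in filenames:
        try:
            # File names look like backup-2016-09-29-163201.tgz
            stamp = datetime.strptime(name, "backup-%Y-%m-%d-%H%M%S.tgz")
        except ValueError:
            continue  # leave files that do not match the pattern untouched
        if stamp < cutoff:
            old.append(name)
    return old


# Hypothetical directory listing; in practice use os.listdir() on the backups path.
names = ["backup-2016-09-29-163201.tgz", "backup-2016-10-20-010000.tgz", "notes.txt"]
stale = backups_older_than(names, days=14, now=datetime(2016, 10, 21))
print(stale)
```

The function deliberately returns names instead of deleting files, so the list can be reviewed or logged before anything is removed.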
For this example we start off by creating a new domain. To create a new domain, click the 'Create' drop-down box and select Domain. Below we use the name 'CPUMemory Domain', as it is best practice to include the type of object after the name.
Picture: Add New Domain Screen
Another good practice is to keep your save locations different for each type of object (Domain, Ad Hoc View, Report). Below, the 'Data Source' is 'Create with Domain Designer', which takes us to the next section. For the sample report we choose the following tables:
dim_date
dim_device
dim_group
hourly_load_1
hourly_load_5
hourly_load_15
hourly_mem_avail_real
Picture: Selected Tables
The "dim_date" table was chosen because we will use the measure "date_day_of_week" to get the number of the day of the week and the measure "date_month" to get the number of the month of the year.
The "dim_device" table adds the device name to the report and serves as the base for the left joins.
The "dim_group" table adds the group name to the report.
The remaining tables will be left joined to gather additional metrics. Another table will be chosen as an additional base for left joins. The reason for the multiple sets of left joins is to align the hour between the metric tables and to display that hour. We want to choose the table with the most values.
Picture: Domain Joins
Above is an example of the left joins. The "dim_device:device_key" field is used as much as possible to link the other tables to the device. A left join is used because we do not want to eliminate rows for which we have no values. The second join set is based on "dim_date:date_key", which lines the metrics up with the correct day. Because this table does not have an hour field, we use a third base: the "fct_ts_gmt" field of our hand-picked metric table, which aligns the metric collections by hour. Our goal is for the collection time to line up across the report for the different metrics we are using.
In the Display section of the Domain creation, select all the fields or the one join set item. It is recommended to rename fields that share the same name before finishing the report. In our example we rename all the "fct_avg" fields after their corresponding table, which makes them easier to identify when creating the ad hoc view.
Picture: Updating Identification
Update the ID and description of any fields you may want to use in the ad hoc view, as this will make completing the ad hoc view easier. When we are done, we click Submit.
To create an Ad Hoc View, click 'Create' at the top and choose 'Ad Hoc View'. Next, choose the domain we created or an existing domain.
Picture: Selecting the domain
We are choosing "CPUMemory Domain" and selecting all the fields. After completing the setup dialogs, we configure the Ad Hoc View page. To create calculated measures, click the drop-down box to the right of the title 'Measures' and choose 'Create Calculated Measure...'
Picture: Creating a Calculated Measure
One of the measures we want to create is the hour in which the data collection was performed. We can do this with the formula hour("fct_ts_gmt"), where "fct_ts_gmt" comes from the base of our third join. You may also be able to use "fct_ts_gmt".
Picture: Creating Hour of Day
We want to put a time stamp in our report for the metric, which can be done with "any_metric:fct_ts_gmt" and "hourly_mem_avail_real:fct_ts". To clean up the view, right-click the column and update the name. We also change the format by right-clicking the column, choosing 'Change Date Format', and selecting "Apr 4, 2019 5:12:46 PM".
Picture: Changing Data Format for Hour - Picking the field
Picture: Changing Data Format for Hour - Changing the format
Next we set up the Day of Week column by selecting the measure "dim_date:date_day_of_week". To set up the numeric value of the month we use the measure "dim_date:date_month".
Picture: Measure Day of Week and Month
The remaining fields are device name, group name, and metrics. The following pictures display the corresponding fields.
Picture: Adding the device name
Picture: Add the device group
Picture: Choosing metrics for Memory
Create any filters you want to use by renaming the columns, and when you are done save the Ad Hoc View. To save, click the floppy icon, choose Save Ad Hoc View As ..., enter the name "CPUMemory Ad Hoc View", select the AdHocView folder, and click Save.
Applies To
Resource Manager 5.1.x
Resource Manager 6.x
Summary
The Resource Manager event console is a powerful tool for sorting and filtering events from your infrastructure. However, filtering and sorting can only be performed against event fields that have been indexed. If you find it necessary to index a new field, the following steps will take you through the process.
Procedure
To begin, we will create a ZenPack to contain your indexed event details. Create your new ZenPack from the Control Center command line and note that the ZenPack name given is for example purposes only. Please note that this step may take several minutes and will exit back to the command line when complete.
serviced service run zope zenpack-manager create ZenPacks.companyName.NewEventAttributes
Launch a new zope shell:
serviced service shell zope
Change to the zenoss user:
su - zenoss
Change to the zep directory in the new ZenPack with
cd /opt/zenoss/ZenPacks/ZenPacks.companyName.NewEventAttributes/ZenPacks/companyName/NewEventAttributes/zep/
Copy the zep.json.example to zep.json:
cp zep.json.example zep.json
Edit the zep.json file to include your indexed event details, like so:
{
    "EventDetailItem": [
        {
            "key": "rig",
            "type": "1",
            "name": "Rig"
        },
        {
            "key": "host",
            "type": "1",
            "name": "Host"
        },
        {
            "key": "app",
            "type": "1",
            "name": "App"
        }
    ]
}
EventDetailItem is the required reserved keyword and is a list of dictionaries, where each dictionary specifies a field. The key attribute must match the name of the event attribute; the name field is the user-friendly name used in the GUI. The type field is documented in the sample zep.json.example file, which also describes the other event detail types; a type of 1 defines a string value.
In the above example, an event transform creates evt.rig, evt.host and evt.app. The GUI Event Console will allow you to select Rig, Host and App as event columns to filter on. Note the capitalization difference; these are the "friendly" names defined in the "name" fields above.
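Before building the egg, it can help to sanity-check the edited file. The following is a minimal sketch in Python, not part of the Zenoss tooling: the function name check_zep_details is hypothetical, and treating key, type, and name as required is an assumption based on the example above rather than on a documented schema.

```python
import json

# Assumption: every entry carries these three fields, as in the example above.
REQUIRED_FIELDS = ("key", "type", "name")


def check_zep_details(text):
    """Return a list of problems found in a zep.json document (empty if none)."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    items = data.get("EventDetailItem")
    if not isinstance(items, list):
        return ["EventDetailItem must be a list of dictionaries"]
    problems = []
    seen_keys = set()
    for position, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append("item %d is not a dictionary" % position)
            continue
        for field in REQUIRED_FIELDS:
            if field not in item:
                problems.append("item %d is missing '%s'" % (position, field))
        key = item.get("key")
        if key is not None:
            if key in seen_keys:
                problems.append("duplicate key '%s'" % key)
            seen_keys.add(key)
    return problems


sample = """
{
  "EventDetailItem": [
    {"key": "rig", "type": "1", "name": "Rig"},
    {"key": "host", "type": "1", "name": "Host"},
    {"key": "app", "type": "1", "name": "App"}
  ]
}
"""
print(check_zep_details(sample))
```

An empty list means the structure matches the shape described above; it does not confirm that ZEP accepts the file.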
Build your egg and move it to your home directory:
cd $ZENHOME/ZenPacks/ZenPacks.companyName.NewEventAttributes
python setup.py bdist_egg
cd dist
cp ZenPacks.companyName.NewEventAttributes* /mnt/pwd
Drop back to the root user and copy the ZenPack to /mnt/pwd.
cp /opt/zenoss/ZenPacks/ZenPacks.companyName.NewEventAttributes/dist/ZenPacks.companyName.NewEventAttributes-1.0.0-py2.7.egg /mnt/pwd/
Exit the container and install the ZenPack:
exit
serviced service run zope zenpack-manager install ZenPacks.companyName.NewEventAttributes*
Stop zeneventserver and rebuild the Lucene indices:
serviced service stop zeneventserver
TENANT_ID=$(serviced service status | grep Zenoss.resmgr | awk '{print $2}')
cd /opt/serviced/var/volumes/$TENANT_ID/zeneventserver/index/
rm -rf archive/ summary/
serviced service start zeneventserver
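The TENANT_ID assignment above takes the second whitespace-separated column of the status line that mentions Zenoss.resmgr. The sketch below mirrors that logic in Python so the behavior is easier to see; the function name find_service_id and the sample status text are invented for illustration and do not reproduce real serviced output. Unlike the grep and awk pipeline, it returns only the first match.

```python
def find_service_id(status_text, service_name="Zenoss.resmgr"):
    """Return column 2 of the first line mentioning service_name, or None."""
    for line in status_text.splitlines():
        if service_name in line:
            columns = line.split()
            if len(columns) >= 2:
                return columns[1]
    return None


# Hypothetical status text: only the column layout matters here.
sample_status = (
    "NAME            ServiceID      Status\n"
    "Zenoss.resmgr   abc123tenant   Running\n"
    "  Zope          def456zope     Running\n"
)

tenant_id = find_service_id(sample_status)
if tenant_id is None:
    raise SystemExit("tenant ID not found; do not continue to the rm step")
print(tenant_id)
```

Checking for an empty result matters because the following cd and rm -rf commands rely on the variable being set.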
Restart Resource Manager:
serviced service restart Zenoss.resmgr
Log into the UI and add the new columns to the Event Console with the Configure > Adjust Columns button. Make sure you scroll down in the Columns to display list; the new fields will be at the bottom.
Resource Manager 6.4.1 is now Generally Available for customer download. For more information, see the Release Notes.
Version 3.0.0 of the WBEM ZenPack (ZenPacks.zenoss.WBEM) has been released.
This release is compatible with Zenoss Cloud and Zenoss 6.2 - 6.4, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector (any version)
The following changes have been made since the previous release: 2.1.0.
Updated documentation describing the usage of 'wbemStatsSampleInterval'
Updates to pywbem library
This release was tested with the following Zenoss versions
Zenoss Resource Manager 6.2.1
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.3.1 of the OpenStack (Tenant View) ZenPack (ZenPacks.zenoss.OpenStack) has been released.
This release is compatible with Zenoss Cloud and Zenoss 6.2 - 6.3, and has no further requirements.
The following changes have been made since the previous release.
Compatibility with Impact 5.5
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.1.0 of the Google Cloud Platform ZenPack (ZenPacks.zenoss.GoogleCloudPlatform) has been released.
This release is compatible with Zenoss Cloud and Zenoss 6, and has no additional requirements.
The following changes have been made since the previous release: 1.0.2.
Add support for Dataflow Jobs. (ZPS-5801)
Add support for Cloud Functions. (ZPS-5719)
Support "reducer" in Stackdriver Monitoring datasource. (ZPS-5558)
Support "filter" in Stackdriver Monitoring datasource. (ZPS-5559)
Support "group by" in Stackdriver Monitoring datasource
Add support for Instance Group Monitoring (ZPS-3501)
Add support for Billing Monitoring with services and region analysis (ZPS-5807)
Support for tags for all components (ZPS-5809)
Support for Billing reports based on tags and labels (ZPS-6024)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.x and 6.4
Zenoss Service Impact 5.5.1
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.3.2 of the Cisco UCS Central ZenPack (ZenPacks.zenoss.CiscoUCSCentral) has been released.
This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
CiscoUCS ZenPack
PythonCollector ZenPack
The following changes have been made since the previous release: 1.3.1
Fix import errors from CiscoUCS 3.0.0. (ZPS-5865)
This version has been tested with Zenoss Cloud and Service Impact 5.5. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 3.0.12 of the OracleDatabase Monitor ZenPack (ZenPacks.zenoss.DatabaseMonitor) has been released.
This release is compatible with Zenoss versions 4.2 - 6.1, and has no further requirements.
The following changes have been made since the previous release: 3.0.10.
Fix permissions issues for sysstat datasource (ZPS-2852)
Fix scale issues in Redo Size (ZPS-2876)
Security: Hide all Connection String info from process list (ZPS-2185)
Ensure txojdbc kills java process after timeout (ZPS-2479)
Change tablespace storage graphs to base 1024 (ZPS-2254)
Set SGA and PGA graphs to base 1024 (ZPS-2604)
Fix missing properties and relations in Analytics (ZPS-2597)
Detailed information on this ZenPack is available from the ZenPack Catalog.
Applies To
Zenoss 4.x
Summary
Zenoss Resource Manager 4.x uses RabbitMQ for messaging between several application daemons. Resource Manager administrators may therefore wish to be familiar with listing RabbitMQ queues, displaying queue contents, clearing queues, and completing other activities related to troubleshooting RabbitMQ.
Procedure
The following tasks are helpful for displaying the RabbitMQ configuration and are run as the root user on the host system.
To list Zenoss exchanges (where messages are published):
# rabbitmqctl -p /zenoss list_exchanges
To list Zenoss queues (where messages are consumed):
# rabbitmqctl -p /zenoss list_queues
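When many queues exist, it can be easier to filter the list_queues output for queues that actually hold messages. The following is a minimal Python sketch that works on captured output text; the function name nonempty_queues is hypothetical, and the sample text (including the second queue name) is an assumed illustration of the name-and-count layout rather than output taken from a real system.

```python
def nonempty_queues(list_queues_output):
    """Return {queue_name: message_count} for queues with at least one message."""
    result = {}
    for line in list_queues_output.splitlines():
        parts = line.split()
        # Data rows look like "<name> <count>"; banner lines such as
        # "Listing queues ..." or "...done." do not match and are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            count = int(parts[1])
            if count > 0:
                result[parts[0]] = count
    return result


# Assumed sample output for illustration only.
sample_output = (
    "Listing queues ...\n"
    "zenoss.queues.zep.rawevents\t1520\n"
    "zenoss.queues.zep.zenevents\t0\n"
    "...done.\n"
)
print(nonempty_queues(sample_output))
```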
The following tasks are useful for troubleshooting queues and are run as the zenoss user.
To show all messages in the specified queue (the zep raw events queue is used as an example):
$ zenqdump zenoss.queues.zep.rawevents
To empty a queue, run a command that displays and acknowledges all messages in the queue. In the sample command below, the zep raw events queue is used as an example. This command does not empty the queue as quickly as possible (a faster method is detailed below):
$ zenqdump -A zenoss.queues.zep.rawevents
The following tasks are useful for troubleshooting queues and are run as the root user.
To clear all queues quickly, reset them by recreating the /zenoss vhost. After stopping Resource Manager, run the following:
# rabbitmqctl delete_vhost /zenoss
# rabbitmqctl add_vhost /zenoss
# rabbitmqctl set_permissions -p /zenoss zenoss '.*' '.*' '.*'
The following actions restore the RabbitMQ service after the host server is renamed. Renaming a server after initial creation can cause problems because RabbitMQ uses the server hostname in file and directory names, and because it ties its NODENAME to the hostname when its configuration is created. As a result, run the following commands whenever the server hostname is changed, after stopping Zenoss:
# service rabbitmq-server stop
# mv /var/lib/rabbitmq /var/lib/rabbitmq.old
# mkdir /var/lib/rabbitmq
# chown rabbitmq:rabbitmq /var/lib/rabbitmq
# service rabbitmq-server start
# rabbitmqctl add_user zenoss zenoss
# rabbitmqctl add_vhost /zenoss
# rabbitmqctl set_permissions -p /zenoss zenoss '.*' '.*' '.*'
Problem Description
Following the upgrade of Resource Manager to 6.3.2 (or on a fresh install), Resource Manager is unable to connect to Analytics due to the update of the Python urllib3 library from version 1.10.2 to version 1.22 (see the Resource Manager 6.3.2 release notes). This affects users who use a self-signed certificate.
Users may notice Analytics data not updating, or may receive an Unable to talk to Analytics server flare when navigating in Resource Manager to Reports Configuration. If attempting to submit new or amended configuration data from Resource Manager for Analytics, users may receive a Failed to connect to <Analytics Internal URL> error flare. The following example is what would be found in the zope event log:
2019-02-05T09:33:06 ERROR zen.etl.router Failed to connect to https://zenoss.example.com:443: Request 'https://zenoss.example.com:443/etl/502bc7a8-8450-11e8-817a-0242ac110018' failed: CERTIFICATE_VERIFY_FAILED certificate verify failed (_ssl.c:579)
Users are able to sign on to Analytics using Resource Manager authentication. This only affects the Analytics ETL process.
How to verify urllib3 library is causing your issue
Navigate to Resource Manager Configuration and confirm if you receive the error flare.
Amend a value in the configuration settings and submit. Verify that the Failed to connect error flare appears, and check the zope logs:
for i in {0..5} ; do serviced service attach Zope/$i cat /opt/zenoss/log/event.log >> event.log ; done
Attach to a zope container and run the following command, replacing the example URL with your own:
python -c "import urllib2;z = urllib2.urlopen('https://zenoss.example.com/');x = z.read()"
Following is a traceback which would confirm the issue:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)>
If nothing is returned then you're not impacted by the urllib3 issue.
Verify the Analytics certificate hasn't expired and that it has the correct common name
In a zope container, run either/both of the following two commands to check the expiry date of the Analytics certificate (updating to your Analytics URL):
curl -v https://zenoss.example.com:443
openssl s_client -connect zenoss.example.com:443 2>/dev/null | openssl x509 -noout -dates
The output would look like the following:
[root@7e0c9c8dfaf0 /]# curl -v https://zenoss.example.com:443
* About to connect() to zenoss.example.com port 443 (#0)
* Trying 10.90.36.188...
* Connected to zenoss.example.com (10.90.36.188) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* Server certificate:
* subject: [email protected],CN=zenoss.example.com,OU=SomeOrganizationalUnit,O=SomeOrganization,L=SomeCity,ST=SomeState,C=--
* start date: Jul 26 14:59:47 2017 GMT
* expire date: Jul 26 14:59:47 2018 GMT
* common name: zenoss.example.com
* issuer: [email protected],CN=zenoss.example.com,OU=SomeOrganizationalUnit,O=SomeOrganization,L=SomeCity,ST=SomeState,C=--
* NSS error -8181 (SEC_ERROR_EXPIRED_CERTIFICATE)
* Peer's Certificate has expired.
* Closing connection 0
curl: (60) Peer's Certificate has expired.
More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn't adequate, you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
the -k (or --insecure) option.
[root@7e0c9c8dfaf0 /]# openssl s_client -connect zenoss.example.com:443 2>/dev/null | openssl x509 -noout -dates
notBefore=Jul 26 14:59:47 2017 GMT
notAfter=Jul 26 14:59:47 2018 GMT
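To turn the notBefore and notAfter lines into a number you can compare or alert on, the output of openssl x509 -noout -dates can be parsed as text. The following is a minimal Python 3 sketch; the function names parse_cert_dates and days_remaining are hypothetical, and the date format string is an assumption based on the sample output above.

```python
from datetime import datetime

# Matches strings such as "Jul 26 14:59:47 2018 GMT" (assumed openssl format).
DATE_FORMAT = "%b %d %H:%M:%S %Y %Z"


def parse_cert_dates(dates_output):
    """Parse 'notBefore=...' and 'notAfter=...' lines into naive datetimes."""
    dates = {}
    for line in dates_output.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name in ("notBefore", "notAfter"):
            dates[name] = datetime.strptime(value, DATE_FORMAT)
    return dates


def days_remaining(dates, now):
    """Whole days until notAfter; negative once the certificate has expired."""
    return (dates["notAfter"] - now).days


sample_dates = (
    "notBefore=Jul 26 14:59:47 2017 GMT\n"
    "notAfter=Jul 26 14:59:47 2018 GMT\n"
)
parsed = parse_cert_dates(sample_dates)
# Compare against a fixed date here so the example is reproducible;
# in practice you would pass datetime.utcnow().
print(days_remaining(parsed, datetime(2019, 2, 5, 9, 33, 6)) < 0)
```

A negative result corresponds to the expired certificate shown in the curl output above.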
Create a new Analytics certificate
On the Analytics host, create some new working directories:
mkdir /tmp/certUpdate
mkdir /tmp/certUpdate/backup
Create the new certificate (this example sets the expiry date for 10 years from now):
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout /tmp/certUpdate/localhost.key -out /tmp/certUpdate/localhost.crt
Back up the old certificates:
mv /etc/pki/tls/certs/localhost.crt /tmp/certUpdate/backup/localhost.crt
mv /etc/pki/tls/private/localhost.key /tmp/certUpdate/backup/localhost.key
Copy the new certificates over:
cp /tmp/certUpdate/localhost.key /etc/pki/tls/private/localhost.key
cp /tmp/certUpdate/localhost.crt /etc/pki/tls/certs/localhost.crt
Restart httpd and the zenoss_analytics service:
systemctl restart httpd
service zenoss_analytics stop
service zenoss_analytics start
Rerun the checks to verify that the expiry date has been updated.
Update the certificates for Resource Manager
Launch a zope shell:
serviced service shell -i -s fixAnalyticsCert zope bash
Define your URL (replace 'zenoss.example.com' with your Analytics URL):
ANALYTICS=zenoss.example.com
Run the check to verify the certificate is still bad:
python -c "import urllib2;z = urllib2.urlopen('https://"${ANALYTICS}"');x = z.read()"
Navigate to the CA directory and create the certificate pem file:
cd /etc/pki/ca-trust/source/anchors/
openssl s_client -showcerts -connect ${ANALYTICS}:443 </dev/null 2>/dev/null|openssl x509 -outform PEM > analyticsCertfile.pem
Update the certificate store (do not edit the store by hand; note that the command produces no output):
update-ca-trust extract
Rerun the check to verify the certificate is okay now:
python -c "import urllib2;z = urllib2.urlopen('https://"${ANALYTICS}"');x = z.read()"
The above should not return anything.
Exit the container and commit the snapshot:
serviced snapshot commit fixAnalyticsCert
Delete the resulting snapshot tag:
serviced snapshot rm <tag>
Restart Resource Manager services:
serviced service restart Zenoss.resmgr
Issue faced with Resource Manager 6.3.2 and Analytics v5.0.6 & v5.1.0
Related articles
https://access.redhat.com/articles/2039753#controlling-and-troubleshooting-certificate-verification-6
The CiscoUCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been updated to 3.0.1 from 3.0.0.
Dependencies:
Calculated Performance ZenPack
CiscoMonitor ZenPack
Dashboard ZenPack
Dynamic Service View ZenPack
Predictive Threshold ZenPack
PythonCollector ZenPack
ZenPackLib ZenPack
Changes:
3.0.1
Add impact relationships for Aggregation Pools. (ZPS-5836)
Fixes Domain Overview Portlet. (ZPS-5839)
Correspond UCSProtocol timeout value to the collection interval. (ZPS-5881)
Fixes event transform for UCS Fault. (ZPS-5941)
Fixes incorrect events for PSU's. (ZPS-5939)
Tested with Zenoss Cloud, Zenoss 6.4.0 and Service Impact 5.5.1
Detailed information on this ZenPack is available from the ZenPack Catalog.
The HP Proliant ZenPack (ZenPacks.zenoss.HP.Proliant) has been updated to 3.3.3. This maintenance release contains several fixes for various issues.
Dependencies:
Zenoss >= 4.2
PythonCollector ZenPack
WBEM ZenPack
Changes:
Updated icons with dark-theme compatible versions
ILO modeler timeout should follow zCollectorClientTimeout
Fix exceptions in impact relationship providers
Overhaul impact relationships
Hide credentials from logging (ZPS-4338)
Add support for ILO Gen 7 physical drives (ZPS-5274)
Fix WBEM component metric duplication issue (ZPS-6132)
Detailed information on this ZenPack is available from the ZenPack Catalog.
Control Center 1.6.5
Control Center 1.6.5 is now Generally Available for customer download. For more information, see the release notes.
Zenoss Service Impact 5.5.1
Zenoss Service Impact 5.5.1 is now Generally Available for customer download. For more information, see the Release Notes.
Zenoss Resource Manager 6.4.0
Resource Manager 6.4.0 is now Generally Available for customer download. For more information, see the Release Notes.
Version 1.0.2 of the Google Cloud Platform ZenPack (ZenPacks.zenoss.GoogleCloudPlatform) has been released.
This release is compatible with Zenoss Cloud and Zenoss 6, and has no additional requirements.
The following changes have been made since the previous release: 1.0.1.
Fixing compatibility with DynamicView 1.8.0 and Impact 5.5. (ZPS-5666)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.2
Zenoss Service Impact 5.5.1
Version 1.11.0 of the PythonCollector ZenPack (ZenPacks.zenoss.PythonCollector) has been released.
This release is compatible with Zenoss Cloud and Zenoss 6, and has no additional requirements.
The following changes have been made since the previous release: 1.10.1.
Support extra datapoint tags in Zenoss Cloud. (ZPS-4587)
Support publishing ad hoc metrics from plugins. (ZPS-4629)
Provide better errors to collector when datamap application fails. (ZPS-5833)
Avoid disabling plugins due to clock jumps by using monotonic clock.
Allow testing of datasources with readable output. (ZEN-31038)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.2
Version 2.9.4 of the Microsoft Windows ZenPack (ZenPacks.zenoss.Microsoft.Windows) has been released.
This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector 1.4 or higher
ZenPacks.zenoss.ZenPackLib 2.0.5 or higher
The following changes/fixes have been made since the previous release: 2.9.3
Fix Traceback seen for zenoss.winrm.Processes modeler plugin (ZPS-5676)
Fix Windows ZP: After upgrade to 2.9.3, seeing "caught returnValue being used in a non-generator" errors (ZPS-5727)
Fix Windows - cluster disks may not be modeled or monitored on 2008 cluster when using Powershell v2.0 (ZPS-5755)
Fix Unexpected Response --Encrypted Boundary during windows collection (ZPS-5767)
Document wincommand notification can be deprecated (ZPS-5501)
Fix Receiving 'NoneType' object has no attribute '__getitem__' when modeling Windows device (ZPS-5770)
Fix Perfmon command(s) did not start: 'NoneType' object has no attribute 'getConnection' (ZPS-5822)
Fix "The referenced context has expired" errors (ZPS-5797)
Fix WinRM device modeler fails to model system OS info if Win32_ComputerSystem doesn't return data (ZPS-5820)
Fix Windows collection attempts to use a dead connection causing a timeout (ZPS-5819)
Fix Cluster Monitoring Doesn't Account for Different Service Names (ZPS-5835)
Fix 'Get-Disk' is not recognized on Windows 2008 Clusters (ZPS-5822)
This release has been tested with Zenoss Cloud, Zenoss Resource Manager 6.3.2 and Service Impact 5.5.0. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 2.5.2 of the Calculated Performance ZenPack (ZenPacks.zenoss.CalculatedPerformance) has been released.
This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.3, and Zenoss 4.2.5, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector >= 1.9
The following changes have been made since the previous release: 2.5.1.
Remove unnecessary impact relationships for aggregation pools. (ZPS-5834)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.2
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.4.3 of the Layer2 ZenPack (ZenPacks.zenoss.Layer2) has been released.
This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.3, and Zenoss 4.2.5, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector >= 1.1
The following changes have been made since the previous release: 1.4.2
Improve performance of calculating network impact relationships. (ZPS-5712)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.2
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 4.0.2 of the VMware vSphere ZenPack (ZenPacks.zenoss.vSphere) has been released.
This release is compatible with Zenoss versions 5.3 - 6.3 and Zenoss Cloud, and has the following additional requirements.
PropertyMonitor ZenPack 1.1.0 or later.
Enterprise Reports ZenPack (any version)
ZenPackLib ZenPack 2.1.0 or higher
The following changes have been made since the previous release: 4.0.1.
Fix invalid threshold for memory capacity on vSphere hosts (ZPS-4979)
Changing zVSphereModelVSAN property also switch monitoring templates (ZPS-5216)
Add support for NSX OpaqueNetworks (ZPS-5076)
Fix "vmodl.fault.MethodNotFound" error from vSAN monitoring against vCenter v6.0 (ZPS-5545)
Handle situation where VMware Tools return a guest IP address with spaces in it (ZPS-5536)
Fix sporadic "'NoneType' is not iterable" errors caused by model-monitored datapoints whose values are not updated frequently. (ZPS-5067)
Migrate Property datasources to vSphere Modeled in local copies of monitoring templates. (ZPS-5628)
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 2.3.3 of the Linux Monitor ZenPack (ZenPacks.zenoss.LinuxMonitor) has been released.
This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.3, and Zenoss 4.2.5, and has the following additional requirements.
ZenPacks.zenoss.ZenPackLib (any version)
The following changes have been made since the previous release: 2.3.2.
Fix and optimize various impact relationship calculations. (ZPS-5664, ZPS-5711, ZPS-5792, ZPS-5806)
Fix "NotFound" modeling exception for snapshots of thin pools. (ZPS-5816)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.3.2
Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.1.3 of the IBM Power ZenPack (ZenPacks.zenoss.IBM.Power) has been released.
This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.3, and Zenoss 4.2.5. The following changes have been made since the previous release: 1.1.2.
Fix inconsistencies in Impact relationships
Do not attempt to set invalid LPAR relations (ZPS-5873)
Detailed information on this ZenPack is available from the ZenPack Catalog.
Applies To
Zenoss 5.0
Summary
Docker uses the default 172.17.0.0/16 subnet for container networking. If this subnet is not available for Docker in your environment (for example, because your network already uses this subnet), you must configure Docker to use a different subnet. You can perform this process across all the hosts in your system, or only on hosts deployed into environments where 172.17.0.0/16 is unavailable. In a multihost deployment, there is no requirement that all hosts use the same subnet for Docker container communications.
Procedure
Stop the Resource Manager services running on the host (for example, the entire Resource Manager application if this procedure is being completed on a master server).
Shut down serviced and Docker on the host by typing the following on the host command line:
$ systemctl stop serviced
$ systemctl stop docker
Remove the existing MASQUERADE rules from the POSTROUTING chain in iptables:
$ iptables -t nat -F POSTROUTING
Remove the existing IP address from the Docker bridge device:
$ ip link set dev docker0 down
$ ip addr del 172.17.42.1/16 dev docker0
Pick a subnet you won't need to route to/from your collector. A /24 should be appropriate unless you require more than about 250 containers on a given host (a /24 has 254 usable addresses, one of which goes to the docker0 bridge). The following example uses 192.168.5.0/24:
$ ip addr add 192.168.5.1/24 dev docker0
$ ip link set dev docker0 up
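When choosing the replacement subnet, it can help to check how many container addresses it offers and whether it overlaps a network you already use. The following is a small Python 3 sketch using the standard ipaddress module; the function names container_capacity and overlaps, and the list of existing networks, are hypothetical examples.

```python
import ipaddress


def container_capacity(bridge_cidr):
    """For a docker0 address such as '192.168.5.1/24', return
    (network, bridge_ip, addresses_left_for_containers)."""
    interface = ipaddress.ip_interface(bridge_cidr)
    network = interface.network
    usable = network.num_addresses - 2  # exclude network and broadcast addresses
    return str(network), str(interface.ip), usable - 1  # exclude the bridge itself


def overlaps(candidate_cidr, existing_cidrs):
    """Return the existing networks that overlap the candidate subnet."""
    candidate = ipaddress.ip_network(candidate_cidr)
    return [c for c in existing_cidrs
            if candidate.overlaps(ipaddress.ip_network(c))]


print(container_capacity("192.168.5.1/24"))
# Hypothetical networks already in use in your environment:
print(overlaps("192.168.5.0/24", ["172.17.0.0/16", "10.90.36.0/24"]))
```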
Verify that the interface has the correct IP set:
$ ip addr show docker0
You should see a result similar to the following (the 'state DOWN' is expected at this stage):
docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 56:84:7a:fe:97:99 brd ff:ff:ff:ff:ff:ff
inet 192.168.5.1/24 scope global docker0
valid_lft forever preferred_lft forever
Start Docker:
$ systemctl start docker
Verify that the MASQUERADE rule for your new subnet has been added to the POSTROUTING chain:
$ iptables -t nat -L -n
As part of the response, you should expect to see the following for your Docker subnet:
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 192.168.5.0/24 0.0.0.0/0
If you see those expected results, start serviced:
$ systemctl start serviced
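The MASQUERADE check in this procedure can also be done programmatically on captured iptables -t nat -L -n output. The following is a minimal Python sketch; the function name has_masquerade_rule is hypothetical, and the sample text is an illustration of the column layout described above, not output from a real host.

```python
def has_masquerade_rule(iptables_output, subnet):
    """True if the POSTROUTING chain has a MASQUERADE rule whose source is subnet."""
    in_postrouting = False
    for line in iptables_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Chain":
            # Track which chain the following rule rows belong to.
            in_postrouting = len(fields) > 1 and fields[1] == "POSTROUTING"
            continue
        # Rule rows have the columns: target prot opt source destination
        if (in_postrouting and len(fields) >= 5
                and fields[0] == "MASQUERADE" and fields[3] == subnet):
            return True
    return False


# Illustrative sample only.
sample_rules = """Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  192.168.5.0/24       0.0.0.0/0
"""
print(has_masquerade_rule(sample_rules, "192.168.5.0/24"))
```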
Applies To
Zenoss Resource Manager 5.x
Summary
It is often useful to run scripts or troubleshooting tools in docker containers. This KB describes the methods necessary to modify Resource Manager (RM) Docker containers to run scripts or tools and enable them to persist.
Because Docker containers are initiated from a base image, they are stateless. This means any modifications made to the container are lost upon restart unless those changes are incorporated into the image. Changes include adding or installing scripts and tools within the container.
The method used to modify the container to make the modifications available to the Zenoss RM depends on how permanent the changes need to be. The choices for permanency include:
Temporary - the script or tool exists only until the container restarts
Semi-permanent - the script or tool is replicated to all other containers and persists on container restart but is lost at next upgrade
Permanent - the script or tool persists following upgrade (Note: although these will be available to any host, there are exceptions depending on use case)
For additional information on containers, see the KB article titled Virtualization and Docker Containerization for Poets.
Procedures
The following procedures explain how to copy tools or scripts into a container and make them persist, depending on the permanency requirements.
Note: The use case described in this KB suggests the zope container because toolbox requires access to zodb. The actual target will vary based on your use case. Contact Zenoss support if you are unsure which service needs access to the custom tool.
Temporary Tool or Script
This procedure makes the tool or script available only temporarily; it will not persist when the container restarts or following an upgrade.
To make the tool or script persist until an upgrade (semi-permanent), proceed instead to the next section, Semi-Permanent Tool or Script.
To make the tool or script persist following an upgrade, proceed instead to the Permanent Tool or Script section.
Move the tool or script into the container from a server running SSH:
Become root if necessary:
sudo su -
Attach to the container, for example:
serviced service attach zope/0
Become the zenoss user:
su - zenoss
SCP the file into the container. For example toolbox.zip:
scp user@remote_server:~/toolbox.zip .
NOTE: It is also possible to use rsync to copy the files to the host.
Execute the script or install the tool. NOTE: Some tools might require installation. For example to install the toolbox utility:
Change into the directory where the toolbox utility is located.
Install the toolbox utility:
easy_install zenoss_toolbox.zip
Semi-Permanent Tool or Script
The following procedure makes the tool, script or utility semi-permanent in the container(s). This means it is replicated to all other containers and persists upon container restart but is lost at the next upgrade.
To make the tool or script persist following an upgrade, proceed instead to the next section, Permanent Tool or Script.
Shell into a new container and move the tool or script into the container from a server running SSH:
Become root if necessary:
sudo su -
Shell into a new container to create it based on an existing container. For example, spawn the new container from the existing zope container:
mynew=InstallToolbox
serviced service shell -s $mynew -i zope
Become the zenoss user:
su - zenoss
SCP the file into the container. For example, toolbox.zip:
scp user@remote_server:~/toolbox.zip .
NOTES:
It is also possible to use rsync to copy the files to the host.
Files located in the host directory from which the new container is spawned are also available inside the container, under the /mnt/pwd directory.
Execute the script or install the tool. NOTE: Some tools might require installation. For example to install the toolbox utility:
Change into the directory where the toolbox utility is located.
Install the toolbox utility:
easy_install zenoss_toolbox.zip
Exit the zenoss user:
exit
Exit the container:
exit
Commit the container where the changes have been made, for example commit the mynew container:
serviced snapshot commit $mynew
Restart the service so the change becomes effective in all containers, for example, zope:
serviced service restart zope
Permanent Tool or Script
The following procedure makes tools or scripts permanent in the container(s) by placing them within a special DFS directory. This special directory mounts to the containers and survives upgrade. This means any tool or script installed in this directory is replicated to all other containers, persists upon container restart, and survives upgrade. (Note: only for single-host instances of 5.x.)
Shell into a new container:
Become root if necessary:
sudo su -
Shell into a new container to create it based on an existing container. For example, spawn the new container from the existing zope container:
mynew=InstallToolbox
serviced service shell -s $mynew -i zope
Become the zenoss user:
su - zenoss
Create a directory within the DFS directory /opt/zenoss/var/ext for your tool or script. For example:
mkdir /opt/zenoss/var/ext/my_new_dir
Copy or install your tool or script into the new directory. For example:
Change into the new directory:
cd /opt/zenoss/var/ext/my_new_dir
Copy or install the tool or script into that directory. For example:
scp user@remote_server:~/toolbox.zip .
easy_install toolbox.zip
Exit the zenoss user:
exit
Exit the container:
exit
Commit the container where the changes have been made, for example commit the mynew container:
serviced snapshot commit $mynew
Restart the service so the change becomes effective in all containers, for example, zope:
serviced service restart zope
Adding the tool or script to the special DFS directory ensures it will persist following an upgrade.
After an Upgrade
Following an upgrade, it is necessary to:
Reapply patches (via quilt for example)
Retrieve any stored customized scripts from /opt/zenoss/var/ext/my_new_dir
Reinstall any previously installed tools from /opt/zenoss/var/ext/my_new_dir
Version 2.3.2 of the Linux Monitor ZenPack (ZenPacks.zenoss.LinuxMonitor) has been released.
This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.2, and Zenoss 4.2.5, and has the following additional requirements.
ZenPacks.zenoss.ZenPackLib (any version)
The following changes have been made since the previous release: 2.3.1.
Guard against out of date sudoers configuration in service monitoring. (ZPS-4334)
Allow filesystem modeling and monitoring to work with or without sudo access. (ZPS-4340)
Fix LVM monitoring when */sbin not in user's path. (ZPS-4349)
Fix undocumented sudo usage of "systemctl status". (ZPS-4121)
Update reduced recommended sudoers configuration. (ZPS-4121)
This release was tested with the following Zenoss versions.
Zenoss Cloud
Zenoss Resource Manager 6.2.1
Zenoss Resource Manager 5.3.3
Detailed information on this ZenPack is available from the ZenPack Catalog.
Applies To
Zenoss Resource Manager 5.x
Summary
This KB is designed to be a quick primer on datasource creation. It uses the case of monitoring an SSL certificate as the example.
Procedure
The following procedure describes how to create a new data source:
From within the RM GUI, make a local template. Click the gear icon on the bottom left to launch the new template dialog.
Give the new template any name you like.
Create a new datasource. Click the plus (+) icon at the top and select command from the list. In this example it is named ssl. The Edit Data Source dialog displays.
Enter the following into the Command Template field:
/bin/bash -c "echo 'OK|days='$$(expr \( $$(echo | openssl s_client -servername ${device/manageIp} -connect ${device/manageIp}:443 2>/dev/null | openssl x509 -noout -dates | grep -v notBefore | date --date="$$(sed 's/notAfter=\(.*\).*/\1/')" +'%s') - $$(date +%s) \) / 60 / 60 / 24 )"
Complete any fields that you require.
Click SAVE to save and close the dialog.
Command Notes
Dollar signs ($) in Commands
Shell constructs that would normally use a single dollar sign ($), such as the command substitution $(expr ...), require two dollar signs ($$). This is because the dollar sign is reserved by the "TALES" engine. For more information about TALES, consult the Appendix of the Administration Guide. An actual TALES variable, ${device/manageIp}, appears in the example command above with a single dollar sign; it inserts the monitored device's IP address. Any dollar sign that is not part of a TALES variable must be escaped in this way, otherwise TALES attempts to interpret the text that follows it.
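As a rough illustration of the escaping rule, the following sketch converts an ordinary shell command into Command Template form by doubling every dollar sign and then restoring ${...} expressions. The function name tales_escape is hypothetical and not part of Zenoss. The sketch assumes the command contains no shell-style ${var} expansions, because those cannot be told apart from TALES variables.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zenoss): prepare a shell command for the
# Command Template field by escaping dollar signs for the TALES engine.
tales_escape() {
    # 1) Double every dollar sign.
    # 2) Turn "$${" back into "${" so that TALES expressions such as
    #    ${device/manageIp} stay intact.
    sed -e 's/\$/$$/g' -e 's/\$\${/${/g'
}

# Example: the command substitution is escaped, the TALES variable is not.
printf '%s\n' 'echo $(date +%s) ${device/manageIp}' | tales_escape
```

The example line prints echo $$(date +%s) ${device/manageIp}, which is the form expected in the Command Template field.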
Use of /bin/bash Before Command
The command is prefixed with /bin/bash to initialize paths and other system variables of the command daemon. Because a shell isn't ordinarily used, if /bin/bash is not used before the command, typing in a command such as echo creates an error because echo is defined in the path of the target operating system. After initializing with /bin/bash it is no longer necessary to use full paths for any of the commands, such as echo, expr, sed, etc.
SSH Checkbox
The use ssh checkbox toggles between running this command on the monitoring server versus running the command on the target server. In this case it is left unchecked because google.com is accessible from the monitoring server.
If it is necessary to run a command that only works on the target server (for example, verify that a specific networking path is open), the check box must be ticked to run it from the other server.
Pad the Data
In the example, the data is wrapped in Nagios format. When the entire command is run in a shell, the result is something like:
OK|days=1234567890
You cannot use a command that outputs only a bare number, such as 1234567890, because the system cannot tell which datapoint the number belongs to; a single command can return multiple datapoints. Each value in the command's output must be labeled so that it matches the name of the datapoint you create in RM (days, for the output shown above).
Set Cycle Time
There is no need to set the cycle time to 300 (5 minutes). Because this value is updated daily, a different value might work better, such as 86400 - the number of seconds in a day.
Use Scripts
Ideally, scripts should be used that provide data even when there is no data to provide.
For example, if SSL is not installed on the target device, the associated variables can be set to 9999.
As a second example, to set a datapoint called present:
Create a new datapoint called present under the datasource within RM.
Have the command return something like OK|days=1234567890 present=1.
It is possible to put logic in, for example, "if this server is running nginx, check for an ssl; then populate the data accordingly".
The commands depend on your particular environment. You are free to run anything that works within the target environment and limited only by your imagination.
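The following is a minimal sketch of a script that always returns data, as recommended above. The function names, the present=0 value for the "no certificate" case, and the sample timestamps are assumptions made for illustration; only the output format, the present datapoint, and the 9999 fallback come from this article.

```shell
#!/bin/sh
# Sketch of a datasource script; function and variable names are illustrative.

# Print whole days between two epoch timestamps (seconds).
days_between() {
    # $1 = current time, $2 = certificate expiry time
    echo $(( ($2 - $1) / 86400 ))
}

# Print one Nagios-style line. An empty expiry means "no certificate found".
emit_ssl_datapoints() {
    now=$1
    expiry=$2
    if [ -z "$expiry" ]; then
        # Nothing to measure: still report values so the datapoints exist.
        echo "OK|days=9999 present=0"
    else
        echo "OK|days=$(days_between "$now" "$expiry") present=1"
    fi
}

emit_ssl_datapoints 1000 173800   # prints OK|days=2 present=1
emit_ssl_datapoints 1000 ""       # prints OK|days=9999 present=0
```

In a real script, the expiry value would come from a check such as the openssl pipeline shown in the Command Template, and the current time from date +%s.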
Applies To
All Versions
Summary
Administrators must choose an approach for organizing their devices in Zenoss and remain consistent in implementing that approach as devices are added. Zenoss offers the following device organizers:
Device classes - Provided out-of-the-box with Zenoss. Enables administrators to organize devices by device type. Additional classes can be created at any time. For example, Zenoss includes a device class called /Devices/Server/Linux with the intention that administrators will place Linux servers in this class. Administrators may choose to add sub-classes to the /Devices/Server/Linux class. However, placing devices in a particular device class has greater implications for Zenoss monitoring than merely grouping them logically. This KB examines these implications in greater detail and makes recommendations about when new classes should and should not be created.
Groups - Device organizers can be created by administrators as needed.
Systems - Device organizers can be created by administrators as needed.
Locations - Device organizers can be created by administrators as needed.
Organizing Devices: A Short Case Study
To fully examine the different approaches administrators can take when organizing devices, consider a fictitious example: Hypothetical Insurance Company. Hypothetical has offices in several cities and began monitoring their devices using Zenoss in Annapolis, Maryland and Washington, DC. The company also has several internal departments, and wants to administer the IT resources of these departments as distinct collections. Their most important departments include Underwriting, Business Development, and Accounting. Hypothetical Insurance could choose from two broad approaches for organizing these devices:
Method #1 (Recommended): Organize Devices by Creating Groups, Systems and Locations Organizers
Hypothetical should make use of the Groups, Systems and Locations organizers provided specifically for the purpose of logically grouping devices. Hypothetical can create their own organizers as required and specify which devices are associated with each. For example, they can create:
Washington and Annapolis Locations organizers
Underwriting, Business Development, and Accounting Groups organizers
They can then move devices into these organizers as appropriate. Although it appears that devices have physically moved when they are dragged-and-dropped to these organizers within the user interface, the devices aren't actually moved within the Zenoss database. Rather, the Zenoss software sets "Groups," "Systems" and "Locations" organizers as attributes of devices. As described below, this distinction has important implications for Zenoss' operation "under the hood".
Method #2 (Not Recommended): Create Device Classes to Organize Devices
Alternatively, Hypothetical could add child classes to each device class to organize their devices. For example, they could create the following child classes of /Devices/Server/Linux/ to organize their Linux servers:
/Devices/Server/Linux/Annapolis
/Devices/Server/Linux/Washington
These child organizers could be further expanded to separate systems in different departments:
/Devices/Server/Linux/Annapolis/Underwriting
/Devices/Server/Linux/Washington/Business_Development
These steps could be repeated for each device class Hypothetical uses.
Creating device classes to organize devices is not recommended for two primary reasons:
1) Creating device classes to organize devices can make it very difficult for administrators to find the devices they are looking for. For example, if someone needed to find all of the servers (of any type) in the Underwriting department, he or she would need to hunt through all of the child classes under /Server, identifying and writing down the names of the devices they are looking for. The broader the administrator's query, the more severe this problem becomes.
2) Creating classes to logically group devices unnecessarily increases the complexity of the Zenoss back-end database. This increased complexity can contribute to performance degradation. Unlike "Groups" or "Systems" organizers, which are attributes defined for devices, a new device class constitutes a new location within the database with its own set of attributes and relationships with other database objects.
When practical, it is preferable to avoid the increased database complexity associated with the creation of new classes unless it is necessary for custom monitoring of a subset of devices. Because Hypothetical is creating organizers only for the purpose of logically organizing their devices, with no changes in the monitoring or modeling specifics for subsets of devices, they should choose Method #1 above and make use of "Groups," "Systems" and "Locations" organizers.
When Hypothetical Should Create New Device Classes
New device classes should be created when one or more of the following need to be unique for a subset of devices:
monitoring template bindings
modeler plugin associations
configuration properties
Consider a concrete example at Hypothetical Insurance. For a subset of Windows servers, system administrators at Hypothetical have decided to fetch server logs related to print spooler failures so they appear in the Zenoss event console. Their method for doing this is to add an additional data source to the Windows server template, but in order to prevent their event console from being clogged up with entries from servers they are unconcerned with, they only want to apply the modified template to a subset of their Windows servers. This scenario would be a perfect use case for the creation of a new sub class of the /Server/Microsoft/Windows device class. Here are the steps Hypothetical administrators would take to create the new class:
Create the new Windows sub-class. For example: /Server/Microsoft/Windows/Print_Logging/
Create a copy of the standard Windows monitoring template for the new class ("copy/override" the template)
Add the additional data source to the new template
Move the target servers into the new class
Note that no changes will be made to the target servers' "Groups," "Systems" or "Locations" organizer settings by moving the target servers to the new sub class. When and if the need to fetch the print spooler events from one or more of the servers ends, they can be moved back to their original device class.
Applies To
Zenoss 5.x
Summary
Hardware failure in Control Center can take various forms, including:
Running out of disk space on one or more of the partitions that store Control Center, Docker or Zenoss data.
Power failure on a Control Center host.
In either case, data might not have been written to disk, leaving your system in an unusable state.
Symptoms
The symptoms of low disk space or power failure include system instability, data loss, and log file entries.
Procedures
The following sections describe some possible hardware failure results and their associated remediation steps.
How to Check Disk Space
The normal du/df commands do not provide useful information when applied to logical volumes managed by the Logical Volume Manager (LVM). This is because there are volume groups that consist of physical volumes that are configured into logical volumes. Use the following LVM-specific commands to get more information.
Identify Volume Groups
To determine which volume groups exist, issue the following command:
# vgs
Display Volume Group Information
To determine the volume group total space, how much free space exists within the volume group and how much space is allocated to logical volumes, issue the following command:
# vgdisplay [vg_name]
Identify Physical Volumes
To determine which physical volumes make up the volume group and which logical volumes exist within the volume group, include -v:
# vgdisplay -v [vg_name]
Display Logical Volume Sizes
To display the size of logical volumes:
# lvs [-a] [--units hHbBsSkKmMgGtTpPeE]
Note: The lvs command can output in units you choose. From the lvs man page:
--units hHbBsSkKmMgGtTpPeE
All sizes are output in these units: (h)uman-readable, (b)ytes,
(s)ectors, (k)ilobytes, (m)egabytes, (g)igabytes,(t)erabytes,
(p)etabytes, (e)xabytes. Capitalise to use multiples of 1000 (S.I.)
instead of 1024. Can also specify custom units e.g. --units 3M
For additional information on the Logical Volume Manager, see the Red Hat Enterprise Logical Volume Manager Administration, LVM Administrator Guide located at: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/index.html.
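To get a quick overview of free space per volume group, the vgs output can be reduced to a percentage. The sketch below is one possible approach, not a Zenoss tool: the function name vg_free_percent and the sample volume group names and sizes are made up for illustration.

```shell
#!/bin/sh
# Sketch: summarize volume group usage. Reads lines of the form
#   <vg_name> <size_in_bytes> <free_in_bytes>
# on standard input and prints the percentage of free space per group.
vg_free_percent() {
    awk 'NF >= 3 && $2 > 0 { printf "%s %.1f%% free\n", $1, ($3 / $2) * 100 }'
}

# Assumed invocation on a host with LVM (run as root):
#   vgs --noheadings --nosuffix --units b -o vg_name,vg_size,vg_free | vg_free_percent

# Self-contained demonstration with made-up numbers:
printf '%s\n' 'serviced 1000 250' 'docker 2000 100' | vg_free_percent
```

The demonstration prints "serviced 25.0% free" and "docker 5.0% free". Requesting sizes in bytes without a unit suffix keeps the arithmetic independent of the unit that vgs would otherwise choose.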
Recovery Steps For Various Scenarios
The Docker Filesystem (/var/lib/docker) Out Of Space
If /var/lib/docker has no available disk space, delete existing containers to free up space, for example:
Shut down serviced.
Delete all existing containers:
docker rm $(docker ps -qa)
If some Docker metadata wasn't fully written to disk, problems can manifest in various ways, for example:
The Docker daemon could refuse to start. This is often caused by the presence of one or more zero-length files in /var/lib/docker (specifically, /var/lib/docker/trust/official.json and /var/lib/docker/repositories-btrfs are known offenders). You can safely delete these files and restart Docker to recover from this.
The serviced daemon could fail to start one or more internal services, logging API Error (500). The Docker logs will show a more specific error. This has, so far, only been seen on Docker versions earlier than 1.6.0. It occurs when Docker's own internal graph database is corrupted. For example, the Docker logs have reported: Cannot find child for /serviced-isvcs_logstash. To correct this issue, perform the following:
Shut down serviced (it should shut down on its own anyway)
Run the command docker ps -a to display all (stopped) containers remaining on the system. Several of them might have no status or name. This is the problem.
Remove all containers:
docker rm $(docker ps -aq)
Start serviced
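To locate the zero-length files mentioned above before deciding what to delete, a helper along these lines can be used. This is a sketch: the function name find_empty_files is hypothetical, and the output should be reviewed manually rather than piped straight into rm.

```shell
#!/bin/sh
# Sketch: list zero-length regular files below a directory.
# The default path comes from this article; pass another directory to test.
find_empty_files() {
    dir=${1:-/var/lib/docker}
    find "$dir" -type f -size 0
}

# Typical use while Docker and serviced are stopped (review before deleting):
#   find_empty_files /var/lib/docker
```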
Corruption Of Control Center Zookeeper (/opt/serviced/var/isvcs/zookeeper) Files
The most likely effect of Zookeeper becoming corrupted is that services will not start, or will not start correctly. They might also report bad networking imports. Virtual hosts might not work properly. This data can be rebuilt. Perform the following:
Shut down serviced
Remove the directory:
sudo rm -rf /opt/serviced/var/isvcs/zookeeper
Delete all existing containers:
docker rm $(docker ps -qa)
Start serviced
Restart serviced on all remote hosts
Corruption of Control Center HBase/OpenTSDB (/opt/serviced/var/isvcs/opentsdb) Files
If the internal HBase becomes corrupted, it may be possible for it to recover by itself. It will attempt this on startup, and log accordingly. The default heap settings may, on very large systems, be inadequate for this recovery process. You'll know because it will keep shutting itself down with an error indicating it's out of heap. You can increase the heap temporarily:
Attach to the running container:
docker exec -it serviced-isvcs_opentsdb bash
Modify the max heap size:
echo "export HBASE_HEAPSIZE=2048" >> /opt/hbase*/conf/hbase-env.sh
Restart HBase:
supervisorctl -c /opt/zenoss/etc/supervisor.conf restart hbase
These settings will be reverted when serviced is restarted. If HBase repairs its corruption successfully, it will start normally. If the repair fails, the HMaster logs will indicate this and you might need to proceed with additional HBase recovery, for example:
Attach to the running container:
docker exec -it serviced-isvcs_opentsdb bash
Run the HBase repair tool:
JAVA_HOME="/usr/lib/jvm/java-7-openjdk-amd64" HBASE_HOME=/opt/hbase-0.94.16 /opt/hbase-0.94.16/bin/hbase hbck -fix
If HBase is corrupted beyond repair, you might need to remove the existing data to allow it to start. Perform the following:
On the master host, stop all processes in the internal metrics container:
docker exec -it serviced-isvcs_opentsdb supervisorctl -c /opt/zenoss/etc/supervisor.conf stop all
Remove the HBase data:
rm -rf /opt/serviced/var/isvcs/opentsdb/hbase/.*
Start the metrics processes:
docker exec -it serviced-isvcs_opentsdb supervisorctl -c /opt/zenoss/etc/supervisor.conf start all
Corruption of Zenoss RabbitMQ (/opt/serviced/var/volumes/*/rabbitmq) Files
If RabbitMQ data has become corrupted, RabbitMQ will be unable to start. Alternatively, it might start but some processes will be unable to connect. In either case, you should remove the existing data. Any messages that were in the queues when the hardware failure occurred will be lost.
Stop the RabbitMQ service:
serviced service stop rabbitmq
Delete the RabbitMQ data:
export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})
export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID
rm -rf $SVCROOT/rabbitmq
rm -f $SVCROOT/.rabbitmq.serviced.initialized
Restart RabbitMQ:
serviced service start rabbitmq
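The SERVICE_ID lookup above feeds directly into rm -rf, so it can be worth checking that the ID is not empty before deleting anything. The following sketch shows one way to add such a guard. The function names are hypothetical, and the parsing simply mirrors the sed and awk pipeline used in this article (second field of the second line of the status output).

```shell
#!/bin/sh
# Sketch of a guard around the service-ID lookup used in this article.

# Read "serviced service status" output on stdin and print the second field
# of the second line, mirroring: sed -n '2p' | awk '{print $2}'
service_id_from_status() {
    awk 'NR == 2 { print $2 }'
}

# Print the volume root for a service ID; fail if the ID is empty so that a
# later "rm -rf $SVCROOT/..." cannot run against the wrong directory.
svcroot_for() {
    if [ -z "$1" ]; then
        echo "service ID is empty; refusing to build a path" >&2
        return 1
    fi
    echo "/opt/serviced/var/volumes/$1"
}

# Assumed usage on the master host:
#   SERVICE_ID=$(serviced service status Zenoss.resmgr | service_id_from_status)
#   SVCROOT=$(svcroot_for "$SERVICE_ID") && rm -rf "$SVCROOT/rabbitmq"
```

Because the rm command is chained with &&, it only runs when a non-empty service ID was found.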
Corruption of Zenoss HBase (/opt/serviced/var/volumes/*/hbase-master) Files
HBase can be recovered in a similar way to the internal HBase. Perform the following:
Attach to the running container:
serviced service attach hmaster
Run the HBase repair tool:
su - hbase -c "hbase hbck -fix"
If HBase is corrupted beyond repair, you may need to remove the existing data to allow it to start.
Warning: Performing the following steps removes all performance data.
To remove existing data to enable HBase to start, perform the following:
On the master host, stop all HBase and OpenTSDB processes:
serviced service stop hbase
serviced service stop opentsdb
Remove the HBase data:
export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})
export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID
sudo rm -rf $SVCROOT/hbase-*
sudo rm -rf $SVCROOT/.hbase-*.serviced.initialized
Start HBase and OpenTSDB:
serviced service start hbase
serviced service start opentsdb
Corruption of Zenoss Zookeeper (/opt/serviced/var/volumes/*/hbase-zookeeper-*) Files
It is possible, though not particularly likely, for the Zookeeper instance(s) used for HBase to become corrupted on hardware failure. Recovery consists of removing the corrupted data:
Stop HBase:
serviced service stop hbase
Delete the Zookeeper data:
export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})
export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID
sudo rm -rf $SVCROOT/hbase-zookeeper-*
sudo rm -rf $SVCROOT/.hbase-zookeeper-*.serviced.initialized
Restart HBase:
serviced service start hbase
Corruption of Event Indexes (/opt/serviced/var/volumes/*/zeneventserver/index)
If the Lucene-based event index becomes corrupted, zeneventserver will automatically rebuild it once the corrupted index data is removed. Perform the following:
Stop zeneventserver:
serviced service stop zeneventserver
Delete the index data:
export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})
export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID
rm -rf $SVCROOT/zeneventserver/index
Start zeneventserver:
serviced service start zeneventserver
The zeneventserver log will indicate the indexes are being rebuilt.
Corruption of Catalog Service Indexes (/opt/serviced/var/volumes/*/zencatalogservice)
If the Lucene-based model index becomes corrupted, you must remove the data and rebuild the catalog. Perform the following:
Stop zencatalogservice:
serviced service stop zencatalogservice
Remove the catalog data:
export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})
export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID
rm -rf $SVCROOT/zencatalogservice
rm -rf $SVCROOT/.zencatalogservice.serviced.initialized
Start zencatalogservice:
serviced service start zencatalogservice
With zencatalogservice started, rebuild the catalog. (Note: because this can take time to complete, perform it in a screen session.)
serviced service attach zope/0 su - zenoss -c "zencatalog run --createcatalog --forceindex"
Corruption of Redis (/opt/serviced/var/volumes/*/redis) Files
If hardware failure occurs in the middle of a snapshot, redis may write an incomplete or zero-length database dump to disk. The redis server will be unable to start and logs will show that it is unable to load the database. Additionally, serviced will continually attempt to restart it. Recovery requires deleting the bad snapshot file. (NOTE: any in-flight data in Redis at the time of hardware failure will be lost).
Attach to the redis container:
serviced service attach redis
Check the logs to verify the issue:
tail -f /var/log/redis/redis.log
If this is the problem, delete the bad snapshot:
rm -f /var/lib/redis/dump.rdb
Redis should recover when it is restarted by serviced (usually within 10 seconds).
Version 2.1.0 of the Ceph ZenPack (ZenPacks.zenoss.Ceph) has been released.
This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements
ZenPacks.zenoss.PythonCollector >= 1.6.1
The following changes have been made since the previous release: 2.0.1
Add backend_addr to host components (ZPS-281)
Add sudo support for Docker containers (ZPS-5736)
Use sudo fallback for .asok sockets in modeler (ZPS-5246)
Change Pool IO to rate instead of gauge (ZPS-3039)
Clarify and enhance docs for sudoers (ZPS-270)
Clarify docs on installing SSH device (ZPS-5742)
Fix pre-Luminous health-check errors (ZPS-5738)
Guard against missing d.cephProxyComponentUUID in DeviceProxy (ZPS-4945)
This version has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.0.0 of the DMTF Redfish ZenPack (ZenPacks.zenoss.Redfish) has been released.
This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements
ZenPacks.zenoss.PythonCollector
ZenPacks.zenoss.ZenPacklib 2.1.0 or higher
This is the first release of the Redfish ZenPack.
Detailed information on this ZenPack is available from the ZenPack Catalog.
Monitoring a Microsoft Cluster
The first step is to add the virtual hostname of the cluster to Resource Manager in the /Server/Microsoft/Cluster device class. Once the cluster is modeled, its nodes are automatically added to the /Server/Microsoft/Windows device class. The only three valid modeler plugins for device class /Server/Microsoft/Cluster are WinCluster, OperatingSystem, and WinMSSQL.
Do not add the WinCluster plugin to the /Server/Microsoft/Windows device class. The /Server/Microsoft/Windows device class uses a python class that is different from the python class used in the /Server/Microsoft/Cluster device class, and they have different relationships.
Monitoring MSSQL
When monitoring MSSQL Enterprise edition in Fail Over mode, the WinMSSQL modeler plugin should be present only on the cluster hostname. For Single Server mode, the WinMSSQL modeler plugin is installed on the physical node running MSSQL. It is not possible to monitor Fail Over mode from a single host in /Server/Microsoft/Windows.
Version 3.0.0 of the Cisco UCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been released. This is a major release that brings capacity related features and other enhancements into the ZenPack.
This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
CiscoMonitor ZenPack >= 4.0
Calculated Performance ZenPack >= 2.5.1
Dashboard ZenPack
DynamicView ZenPack
Predictive Threshold ZenPack
PythonCollector ZenPack
ZenPackLib ZenPack >= 2.1
Note: This version of CiscoUCS ZenPack fully replaces UCSCapacity - If you are using UCSCapacity ZenPack in your Zenoss installation, it should be upgraded to 2.0.0 before upgrading CiscoUCS 2.x to CiscoUCS 3.0.0. UCSCapacity can be removed after CiscoUCS has been upgraded.
The following changes have been made since the previous release: 2.8.0.
Add all UCSCapacity features
Upgrade to ZenPackLib 2.x
Fix "Edit with Domain Designer" issue on Cisco UCS Manager Domain. (ZPS-1671)
Fix massively growing counts for Cisco UCS faults. (ZPS-3202)
Do not autogenerate CiscoUCS reports while opening. (ZPS-4175)
Show a spinner wheel when generating Storage reports. (ZPS-4305)
Ensure CIMC session closing after collection when zCiscoUCSCIMCReuseSessions is false. (ZPS-4555)
Guard against empty object maps in modeler. (ZPS-5617)
Allow user specified severity fields for UCS Manager events. (ZPS-5199)
Fix slider in Dependency View to update resources by utilization. (ZPS-5643)
Update Topology View for S3260 storage chassis. (ZPS-3576)
This version has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 4.0.0 of the NetApp Monitor ZenPack (ZenPacks.zenoss.NetAppMonitor) has been released.
This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector >= 1.8.0
ZenPacks.zenoss.StorageBase >=1.4.3
ZenPacks.zenoss.CalculatedPerformance (any version)
ZenPacks.zenoss.ZenPackLib >= 2.1
The following changes have been made since the previous release: 3.7.0.
Add SNMP datapoints and new graphs on device level and for LUN and HardDisk components (ZPS-2451)
Add support for monitoring events of OnCommand Unified Manager (ZPS-2961)
Convert monitoring templates to ZPL format (ZPS-4326)
Convert NetApp components to ZPL format (ZPS-4326)
Extend support of SNMP monitoring with additional datapoints and events (ZPS-2451)
Fix thresholds for inode usage for offline volumes (ZPS-5394)
Improve Impact and Dependency View relations (ZPS-4326)
Move frequently changing metrics from modeling to monitoring (ZPS-5114)
This version has been tested with Zenoss Resource Manager 6.3.2, Zenoss Cloud and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 1.0.2 of the Nutanix ZenPack (ZenPacks.zenoss.Nutanix) has been released.
This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following requirements.
ZenPacks.zenoss.ZenPackLib 2.0.8 or higher
ZenPacks.zenoss.PythonCollector
The following changes have been made since the previous release: 1.0.1.
Fix minor log warnings (ZPS-5620)
Fix incorrect datapoint for vm/cvm memory_usage_pct (ZPS-2904)
Fix unhandled plugin errors while monitoring (ZPS-3428)
Fix modeling when cluster uuid missing from API (ZPS-4189)
This release has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 2.9.3 of the Microsoft Windows ZenPack (ZenPacks.zenoss.Microsoft.Windows) has been released.
This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector 1.4 or higher
ZenPacks.zenoss.ZenPackLib 2.0.5 or higher
The following changes/fixes have been made since the previous release: 2.9.2
Fix deprecated Get-WmiObject cmdlet for PowerShell Core (ZPS-4927)
Fix Windows Perfmon data collection stops for long time after device reboot (ZPS-4473)
Fix Windows - No freespace on cluster shared volumes (ZPS-4612)
Fix Better handling in Perfmon datasource of "is not recognized as the name of a cmdlet" errors (ZPS-3517)
Fix 500 Operation Timeout Errors when modeling and/or monitoring SQL Server (ZPS-4638)
Fix Windows Cluster - wrong ip address can be returned for sql server node (ZPS-4703)
Fix Windows Cluster - sql server fails over doesn't trigger a remodel if cluster group stays the same (ZPS-4707)
Fix Windows Perfmon data collection stops for long time after collection interruption (ZPS-4473)
Fix Windows may connect to device with wrong zWinRMUser (ZPS-3564)
Fix Microsoft.Windows - Detect .NET version better for EventLogDataSource (ZPS-5399)
Fix Windows Disconnected Network Drives that Cause PowerShell Error (ZPS-4866)
Fix After upgrade to 2.9.x, some/most modeler plugins fail with "'NoneType' object has no attribute 'getConnection'" (ZPS-5087)
Fix Windows Cluster - clusterOwnerChange may not be in zWindowsRemodelEventClassKeys after install/upgrade (ZPS-4887)
Fix Windows Cluster - SQL Server instance metrics may not be found (ZPS-4888)
Fix WindowsServiceLog "The referenced context has expired" error (ZPS-3216)
Fix Add ERROR handling for empty win32_SystemEnclsoure data (ZPS-5253)
Fix Windows devices monitored over https regularly fail collection (ZPS-5323)
Fix Increase Flexibility in Microsoft ZenPack for Data Source using Microsoft's Event Log (ZPS-5585)
Fix Windows Cluster - Missing or no data returned when querying job events do not clear after failover remodel (ZPS-4874)
This release has been tested with Zenoss Cloud, Zenoss Resource Manager 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.
Version 5.10.0 of the Cisco Devices ZenPack (ZenPacks.zenoss.CiscoMonitor) has been released.
This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.
ZenPacks.zenoss.PythonCollector >= 1.4
ZenPacks.zenoss.ZenPackLib >= 2.0.5
The following changes have been made since the previous release: 5.9.0.
Add support for ASA VPN tunnels. (SVC-2005)
Add missing hardware models. (ZPS-3515)
Add Impact policies to CiscoDevice for Fans, PSU, and Temperature Sensor. (ZPS-3972)
Handle unknown StandbyState in cHsrpStateChange. (ZPS-3644)
Improve performance of snmp trap transform.
Make network interface descriptions searchable. (ZPS-3949)
Fix for "invalid literal for int()" errors in ISDN monitoring. (ZPS-3698)
Fix for "AttributeError: vnis" in zenmapper.log. (ZPS-3965)
Do not autogenerate Cisco Inventory report. (ZPS-4204)
Shows a spinner wheel when generating Cisco Inventory report. (ZPS-4253)
Add Device-Modern template and change current templates for ASA Devices. (ZPS-3598)
This version has been tested with Zenoss 6.3.2 and Zenoss Cloud. Detailed information on this ZenPack is available from the ZenPack Catalog.