Zenoss FAQs | Comparably
Zenoss is a leading provider of unified monitoring and analytics software for physical, virtual, and cloud-based IT infrastructures.

Zenoss FAQs

Zenoss's Frequently Asked Questions page is a central hub where customers can go with their most common questions. These are the 581 most popular questions Zenoss receives.

Frequently Asked Questions About Zenoss

  • Applies To

    Zenoss 5.x

    Zenoss 6.x

    Summary

    When it becomes necessary to perform maintenance on Zenoss Resource Manager (RM), it is useful to perform controlled, incremental shutdowns and startups for Control Center (CC) and Resource Manager. This KB discusses the best practices for performing these tasks.

    Procedures

    Controlled Shutdown Procedure

    Stop RM and top-level applications.

    The RM application should be stopped before stopping Control Center. This will allow for a controlled startup of RM services after Control Center is restarted.

    Use the following command to stop RM:

    # serviced service stop Zenoss.resmgr

    Wait for all of the applications to stop. To determine if the applications are stopped, watch the status of services until all have a state of Stopped. The following command lists every service that is NOT stopped. It is safe to ignore services that have no status - these are the folder services that do not run anything:

    # serviced service status | grep -v Stopped
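    The wait described above can be scripted as a polling loop. This is a minimal sketch and not part of the original article: the function name wait_until_no_match is hypothetical, and the state names in the usage comment (Running, Stopping) are assumptions about what the status column shows, so adjust the pattern to match the output on your system.

```shell
# Hypothetical helper: poll a command until none of its output lines
# match the given extended regular expression.
#   $1 = pattern, $2 = seconds between polls, remaining args = command
wait_until_no_match() {
    local pattern="$1" interval="$2"
    shift 2
    # grep -q succeeds while at least one line still matches the pattern.
    while "$@" | grep -Eq "$pattern"; do
        sleep "$interval"
    done
}

# Example (assumed state names; run as root on the CC master):
#   wait_until_no_match 'Running|Stopping' 10 serviced service status
```

    Matching on the states that indicate activity, rather than on the absence of Stopped, keeps the header line and the status-less folder services from holding the loop open.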

    Stop the CC master:

    # systemctl stop serviced

    Stop all resource pool workers (formerly known as CC agents).

    On each resource pool worker node, stop serviced:

    # systemctl stop serviced

    Remove stray docker containers.

    NOTE: Perform this step on both the master and the resource pool worker nodes.

    Although stopping serviced is normally enough to stop all docker containers on the node, sometimes stray containers remain. Because stray containers can prevent NFS from unmounting in the next step, ensure all containers are stopped and no strays remain.

    Determine if there are remaining containers, including those in an Exited status:

    # docker ps -a

    If any containers remain, use the following command to remove them:

    # docker ps -qa | xargs --no-run-if-empty docker rm -fv

    rm Command Hangs

    In some edge cases, that last command can hang because a container will not die, most frequently because of an NFS hang. To resolve the issue and stop the container, perform the following:

    Stop NFS:

    # systemctl stop nfs

    Stop docker:

    # systemctl stop docker

    Start NFS:

    # systemctl start nfs

    Start docker:

    # systemctl start docker

    Kill remaining container(s):

    # docker ps -qa | xargs --no-run-if-empty docker rm -fv

    NOTE: If a container continues to persist, a last resort is to reboot the resource pool worker node.

    Unmount all resource pool NFS mount points.

    NOTE: This step applies to the resource pool worker nodes.

    When serviced and the docker containers are stopped, unmount the NFS mount points. This prevents stale NFS mount errors on the resource pool worker nodes in cases where the master's storage has to be completely replaced.

    Check for active mounts:

    # grep serviced /proc/mounts

    A typical agent-side mount has the format:

    <ccMasterIP>:/serviced_volumes_v2/<tenantID> /opt/serviced/var/volumes/<tenantID>

    For example:

    1.8.1.4:/serviced_volumes_v2/erjcdpennn9yqfcg82hfso6q7 /opt/serviced/var/volumes/erjcdpennn9yqfcg82hfso6q7

    If there are no serviced mounts, unmounting is complete. If mounts exist for serviced, they must be removed with the force option, as described below.

    Note: If the master has multiple tenant applications, there can be multiple mounts.

    Use the force option to unmount the volumes:

    # umount -f /opt/serviced/var/volumes/<tenantID>

    Confirm the mount point is removed. Consult /proc/mounts. Note that in some edge cases, a simple umount does not work and the workaround is to do a lazy unmount and restart NFS:

    # umount -f -l /opt/serviced/var/volumes/<tenantID>

    # systemctl restart nfs

    If the umount still fails, reboot the resource pool worker node.
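    The check-then-unmount sequence above can be sketched as two small shell functions. The function names are hypothetical, and the sketch assumes the volumes are mounted under /opt/serviced/var/volumes/ as in the example mount shown earlier. The NFS restart and the reboot fallback are left to the operator.

```shell
# Print the mount point of every serviced volume listed in a mounts table.
#   $1 = mounts file to read (defaults to /proc/mounts)
list_serviced_mounts() {
    local mounts_file="${1:-/proc/mounts}"
    # Field 2 of each /proc/mounts line is the mount point.
    awk '$2 ~ /^\/opt\/serviced\/var\/volumes\// { print $2 }' "$mounts_file"
}

# Try a forced unmount first, then fall back to a forced lazy unmount,
# in the same order as the steps above. Returns non-zero if both fail
# for any mount point.
unmount_serviced_volumes() {
    local mount_point status=0
    for mount_point in $(list_serviced_mounts "$@"); do
        umount -f "$mount_point" || umount -f -l "$mount_point" || status=1
    done
    return "$status"
}

# Example (run as root on a resource pool worker once serviced is stopped):
#   unmount_serviced_volumes && ! grep -q serviced /proc/mounts
```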

    Controlled Startup Procedure

    This procedure helps avoid a chaotic startup in a large production environment and enables rapid isolation and resolution of any problems that appear during startup or restart. There are many valid variations in which RM services are started and in what order. The key point of this procedure is to defer starting the collection services until all other RM services required to support the collectors are up and running.

    Start CC Master

    Determine if the CC deployment is using a Zookeeper ensemble. To determine if an ensemble is configured, inspect the /etc/default/serviced file and look for the definition:

    SERVICED_ISVCS_ZOOKEEPER_QUORUM

    If SERVICED_ISVCS_ZOOKEEPER_QUORUM is not set, the deployment is not using a Zookeeper ensemble and the CC master can be started.

    If SERVICED_ISVCS_ZOOKEEPER_QUORUM is set, the deployment is using a Zookeeper ensemble and the CC master cannot be started without the other nodes that make up the ensemble. Before starting the CC master, Control Center must be started on the resource pool worker nodes defined by that quorum string.
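    The inspection of /etc/default/serviced can be sketched as a one-line check. The function name is hypothetical, and the sketch assumes the setting appears as a plain NAME=value line, with unset defaults left commented out behind a leading "#".

```shell
# Succeed (exit 0) when SERVICED_ISVCS_ZOOKEEPER_QUORUM has an uncommented,
# non-empty definition in the given file.
#   $1 = configuration file (defaults to /etc/default/serviced)
zk_quorum_configured() {
    local conf="${1:-/etc/default/serviced}"
    grep -Eq '^[[:space:]]*SERVICED_ISVCS_ZOOKEEPER_QUORUM=.+' "$conf"
}

# Example:
#   if zk_quorum_configured; then
#       echo "Ensemble in use: start serviced on the quorum nodes first."
#   else
#       echo "No ensemble: the CC master can be started."
#   fi
```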

    Start the CC master:

    # systemctl start serviced

    Monitor the CC log file to verify successful startup:

    # journalctl -u serviced -o cat -f

    Restart Worker Nodes and RM

    For the following steps, verify that RM services for each step are started and show clean health checks before proceeding to the next step. If the services do not start, or do not pass their health checks, stop and resolve the underlying issue before continuing.

    Restart Resource Pool Worker Nodes

    Log in to the resource pool worker nodes and restart serviced.

    Note: This step is not required for any resource pool worker node started in the previous step to enable the Zookeeper ensemble.

    Monitor the Hosts page in the CC UI to verify all hosts are up and running.

    Review IP Assignments

    In cases where the system has been restarted following a restore from backup, IP assignments may need to be defined. To review the assignments:

    Navigate to the Zenoss.resmgr page under Applications in the CC UI.

    Review the IP Assignments table.

    If all of the services already have an IP assignment, continue to the next step.

    If any service does not have an IP assignment, add one for that service. If Automatic assignment does not add the appropriate IP assignments, use a manual IP assignment.

    Start Services & Metrics

    Start the services under HBase. HBase is the first set of services to start because OpenTSDB and the rest of the performance metric pipeline depend on the HBase services.

    Start the remaining services under Infrastructure.

    Start the services under Events.

    Start the services under User Interface.

    Start the zproxy service. ZProxy is the topmost service (named Zenoss.resmgr in most installations). Clicking the Start button for the Zenoss.resmgr service in the UI provides the choice of starting just that single service, or that service and all of its children. Choose to start only the single Zenoss.resmgr service.

    Start the services under Metrics.

    Start the service(s) named zenhub.

    Start the remaining collector services.

    Log into RM and spot check data collection and events to verify everything is working properly.

  • Pre-Requisites

    Access to a machine with curl installed that can reach the web interface of the Zenoss server.

    Login credentials for a user account with the Manager, ZenManager, or ZenOperator role assigned.

    Applies To

    Zenoss 5.x

    Zenoss 4.x

    Summary

    The following are examples of creating and closing events using the curl command from a Linux terminal. These examples programmatically create and close events on your Zenoss system.

    For these examples:

    Login credentials are admin:zenoss

    Zenoss Resource Manager system is sandbox411.zenoss.loc

    zope/zenwebserver listens on port 8080.

    Note: the two examples are formatted differently to demonstrate two ways of escaping characters in bash. Both are correct; use whichever method you prefer.

    Create an event:

    curl -u "admin:zenoss" -X POST -H "Content-Type:application/json" -d "{\"action\":\"EventsRouter\", \"method\":\"add_event\", \"data\":[{\"summary\":\"test55\", \"device\":\"test-rhel6.zenoss.loc\", \"component\":\"\", \"severity\":\"Critical\", \"evclasskey\":\"\", \"evclass\":\"/App\"}], \"type\":\"rpc\", \"tid\":1}" "sandbox411.zenoss.loc:8080/zport/dmd/evconsole_router"

    Close an event:

    curl -u "admin:zenoss" -X POST -H "Content-Type:application/json" -d '{"action":"EventsRouter","method":"close","data":[{"evids":["0050568a-2045-b48a-11e2-708c5755a8dd"],"params":"{\"severity\":[5,4,3,2,1,0],\"eventState\":[0,1,2]}","limit":1}],"type":"rpc","tid":1}' "sandbox411.zenoss.loc:8080/zport/dmd/evconsole_router"
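    Both examples above embed the JSON by hand. As a third option, the body can be generated by a small function so that no manual escaping is needed at the call site. This is a sketch only: the function name add_event_payload is hypothetical, the field layout is copied from the create example above, and the values are assumed to contain no double quotes or backslashes.

```shell
# Build the JSON body for an EventsRouter add_event call.
#   $1 = summary, $2 = device, $3 = severity, $4 = event class
# Assumption: the values contain no double quotes or backslashes,
# because this sketch does no JSON escaping.
add_event_payload() {
    printf '{"action":"EventsRouter","method":"add_event","data":[{"summary":"%s","device":"%s","component":"","severity":"%s","evclasskey":"","evclass":"%s"}],"type":"rpc","tid":1}' \
        "$1" "$2" "$3" "$4"
}

# Example, using the same sandbox host and credentials as above:
#   curl -u "admin:zenoss" -X POST -H "Content-Type:application/json" \
#        -d "$(add_event_payload test55 test-rhel6.zenoss.loc Critical /App)" \
#        "sandbox411.zenoss.loc:8080/zport/dmd/evconsole_router"
```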

    UUID Returned

    The uuid returned in the response from the JSON call to create the event is a uuid of the JSON call and is not directly tied to any uuid generated by the event processing system. As such, the uuid for the event will need to be determined before a curl command can be created to close the event. To identify the event's uuid, create a filter and send another request for events that match your filter.

    For more information:

    See the JSON API documentation at http://wiki.zenoss.org/Working_with_the_JSON_API

  • Version 1.0.0 of the Cloud View ZenPack (ZenPacks.zenoss.CloudView) has been released.

    This release is compatible with Zenoss 6.3 - 6.4 and Zenoss Cloud, and has the following additional requirements:

    ZenPacks.zenoss.ZenPackLib 2.1.0 or higher

    This is the first release of the Cloud View ZenPack.

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Pre-Requisites

    Permissions to edit device templates

    Access to zendmd highly recommended

    Applies To

    Zenoss 4.2.x

    Zenoss 4.1.1

    Zenoss 4.1.0

    Zenoss 4.0.2

    Summary

    It is often necessary to set up a threshold based on a percentage instead of a raw value. To do this, we need two things:

    The current value

    Either the minimum or maximum value possible

    Because component templates provide us with some of these minimum or maximum values, it is relatively easy to accomplish in some cases. The information needs to be called from wherever it exists in the object database.

    Procedure

    Component Template Example:

    Component templates provide a lot of valuable information that we can use in creating thresholds. One of the most common thresholds that support is asked to help create is memory utilization based on percentage. In this example, we're using a device in the /Server/SSH/Linux device class. Two relevant data points are available, each measuring something different:

    mem.MemFree

    The above data point gives us the amount of free (unused) memory available on the system in bytes.

    mem.SwapFree

    The above data point gives us the amount of free (unused) swap available on the system in bytes.

    We can create a threshold based on memory or swap utilization using the maximum values that the component template gathers during modeling. To create a threshold that generates an event based on utilization of either memory or swap, create a MinMaxThreshold on whichever data point you want to use, and then set the threshold's minimum or maximum value to an expression that includes the attribute the component template gathered.

    For this example, setting the Minimum value of a threshold based on the data point MemFree to be 10% of the total memory looks like this:

    here.hw.totalMemory * .1

    The above line references here.hw.totalMemory -- an attribute in the object database that was gathered by the component template during modeling. At the beginning, "here" refers to the device being monitored. The "hw" organizer exists in the object database, and "totalMemory" is one of the attributes stored in it. The "totalMemory" attribute is the total installed physical memory of the system in bytes. We want to multiply it by ".1" to get 10% of the maximum memory. That way, if the memory available drops below 10% of total, we get an event.

    We can do the same for swap:

    here.os.totalSwap * .1

    Note that the above line references a different organizer, os, from which the value totalSwap is pulled.
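    To see what raw value such an expression produces, the arithmetic can be checked from a shell. This is an illustration only: the helper name is hypothetical, the 8 GiB figure is an assumed example device, and the integer arithmetic truncates any fractional byte, whereas the threshold expression itself is evaluated by Zenoss.

```shell
# Compute the raw byte value that corresponds to a percentage of a total.
#   $1 = total in bytes, $2 = whole-number percentage
# Uses integer arithmetic, so any fractional byte is truncated.
percent_of_total() {
    echo $(( $1 * $2 / 100 ))
}

# Hypothetical device with 8 GiB of physical memory (8589934592 bytes):
percent_of_total 8589934592 10    # prints 858993459
```

    On such a device, a minimum of here.hw.totalMemory * .1 would raise an event once mem.MemFree drops below roughly 859 million bytes.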

  • Summary

    This document clarifies when the End of Maintenance (EOM) occurs for a particular release.

    Details

    The End of Maintenance date is stated in the Description of Software, Support, and Services (DOSSS) as the following:

    Zenoss is not required to provide maintenance for a version of the software after 12 months following the release of the subsequent version of the software.

    For each major version, maintenance releases are only provided for the most recently released minor version.

    Based on the release dates of our solutions, this breaks down to the following:

    Release         Generally Available   End of Maintenance
    Zenoss 2.4      2009-05-04            2010-10-26
    Zenoss 2.5      2009-10-26            2011-07-15
    Zenoss 3.0      2010-07-15            2012-02-14
    Zenoss 3.1      2011-02-14            2012-09-01
    Zenoss 3.2      2011-09-01            2012-10-27
    Zenoss 4.0      2011-10-27            2012-11-14
    Zenoss 4.1.1    2011-11-14            2013-12-03
    Zenoss 4.2.2    2012-12-03            2014-03-05
    Zenoss 4.2.3    2013-03-05            2014-09-30
    Zenoss 4.2.4    2013-09-30            2018-06-30
    Zenoss 4.2.5    2014-05-28            2018-06-30
    Zenoss 5.0      2015-02-15            2016-06-30
    Zenoss 5.1      2016-03-07            2017-12-31
    Zenoss 5.2      2016-11-30            2018-03-31
    Zenoss 5.3      2017-08-17            2018-11-08
    Zenoss 6.0      2017-11-08            2018-03-31
    Zenoss 6.1      2018-01-09            2018-06-30
    Zenoss 6.2      2018-06-13            2019-01-31
    Zenoss 6.3      2019-01-05            2019-08-23
    Zenoss 6.4      2019-08-23            Current Release

    Maintenance for all ZenPacks supported with a specific release is tied to the specific release EOM date. This includes Service Impact and Analytics.
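    The 12-month rule quoted from the DOSSS can be checked against the listed dates with a little date arithmetic. This is a sketch with a hypothetical function name. It reproduces rows such as Zenoss 2.4 and 5.3, but several other rows (for example Zenoss 5.0 and 4.2.4) list dates that do not follow the simple rule, so treat the published dates as authoritative.

```shell
# Add twelve months to an ISO date (YYYY-MM-DD) by incrementing the year.
# Assumption: the date is not February 29, which has no counterpart in
# the following year.
twelve_months_after() {
    local ga_date="$1"
    local year="${ga_date%%-*}"        # text before the first hyphen
    local month_day="${ga_date#*-}"    # text after the first hyphen
    echo "$(( year + 1 ))-$month_day"
}

# Zenoss 2.5 became generally available on 2009-10-26, so Zenoss 2.4
# reaches End of Maintenance on:
twelve_months_after 2009-10-26    # prints 2010-10-26
```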

    FAQ

    Q: What does "End of Maintenance" mean to me as a user?

    A: When a product reaches "End of Maintenance", no more patches will be released for that version. End of Maintenance releases will continue to operate, and support will be provided for the release.

  • Version 1.3.3 of the Cisco UCS Central ZenPack (ZenPacks.zenoss.CiscoUCSCentral) has been released.

    This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements:

    CiscoUCS ZenPack

    PythonCollector ZenPack

    The following changes have been made since the previous release: 1.3.2

    Fix import errors in CiscoUCSCentral due to removed dependencies in CiscoUCS 3.0.2 (ZPS-6427)

    This version has been tested with Zenoss Cloud and Service Impact 5.5. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 2.1.0 of the Microsoft Azure ZenPack (ZenPacks.zenoss.Microsoft.Azure) has been released.

    This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements:

    ZenPacks.zenoss.PythonCollector >= 1.2

    ZenPacks.zenoss.ZenPackLib >= 2.1.1

    The following changes have been made since the previous release (2.0.0):

    Add new Application Insights and Azure Function components

    Fix events for Service Plan components in case all resources are stopped (ZPS-5652)

    Fix for Random Monitoring of Instance failed or GET Azure metrics unsuccessful failures events (ZPS-6069)

    Fix potential KeyError traceback when modeling (ZPS-6211)

    Ensure complete component model after every modeling attempt (ZPS-6258)

    Update migration scripts to avoid UI flares for some specific Zenoss configurations (ZPS-5737)

    This release has been tested with Zenoss 6.3.x, Zenoss 6.4.x, Zenoss Cloud and Service Impact 5.5.1.

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Applies To

    Zenoss 4.x

    Zenoss 3.x

    Zenoss 2.x

    Summary

    Modeling refers to discovering information on a device. Modeling information is gathered by the zenmodeler daemon. This information can include basic configuration such as hardware model, OS information, serial numbers, and software inventory. It can also include components such as network interfaces and filesystems. After a device is modeled, performance monitoring daemons use the modeling information to determine what to monitor (for example, which filesystems to monitor or exclude) and how to monitor it (for example, by SNMP index).

    Note: The ZenVMware ZenPack is a special case that uses the zenvmwaremodeler daemon, and is not discussed here.

    The zenmodeler daemon chooses the modeler plugins (called collector plugins in the GUI prior to version 3.x) according to the zCollectorPlugins Z-property. The daemon then runs the plugins against the device to gather the modeling information, which is then passed to zenhub to make changes to the ZODB database. By default, modeling runs every 12 hours, starting from the time when the zenmodeler daemon starts. Because modeling can be processor-, disk-, and resource-intensive for both the devices and the Zenoss server, manually scheduling the modeling process can help manage resources.

    Procedures

    The following procedures describe how to configure the modeling frequency for various Zenoss versions. Follow the instructions for your Zenoss version.

    For Zenoss version 4.x

    Perform the following to configure the modeling frequency using crontab:

    Stop the zenmodeler daemon on all collectors:

    From the command line, run the following command as the zenoss user:

    $ zenmodeler stop

    From the Zenoss UI:

    Navigate to Advanced > Daemons on the left hand navigation bar.

    Click the Stop button for the zenmodeler daemon.

    Prevent zenmodeler from running when Zenoss starts up. Remove zenmodeler from the files that control zenmodeler startup:

    Edit the $ZENHOME/etc/daemons.txt file and, if you are using it on the collectors, the $ZENHOME/etc/collectordaemons.txt file.

    Locate and delete the zenmodeler line.

    Save and exit the file(s).

    Update the collectors to pick up the change. In the Zenoss UI:

    Navigate to Advanced > Collectors.

    Select the remote collectors you want to update from the right-hand pane.

    Select the menu item Update Collector...

    Create a cron job on each collector for the zenoss user for zenmodeler. Add entries to crontab to schedule zenmodeler: modify a crontab in /etc to determine when zenmodeler should run. For example, edit /etc/cron.daily/zenoss to add the job:

    # Remodel Zenoss devices on this collector.
    0 0 * * * bash -lc '/opt/zenoss/bin/*_zenmodeler run --daemon --now'

    Note: Use the "*_" string if you have multiple collectors. Each collector has its own command, such as collector1_zenmodeler or collector2_zenmodeler. Remove the "*_" string if you have a single collector on the master.

    For Zenoss versions 2.3.x, 3.x

    Perform the following to change the modeling frequency using cron:

    Stop the zenmodeler daemon.

    Stop the currently running zenmodeler daemon on each collector (you may have only one collector).

    From the command line, run the following command as the zenoss user:

    $ zenmodeler stop

    For version 3.x:

    Navigate to Advanced > Daemons on the left hand navigation bar.

    Click the Stop button for the zenmodeler daemon.

    For version 2.3-2.5.2:

    Navigate to the Settings > Daemons tab.

    Click the Stop button for the zenmodeler daemon.

    Stop ZenModeler from being automatically restarted in the future. Remove the zenmodeler line from the $ZENHOME/etc/daemons.txt file so that it does not start automatically with the rest of Zenoss.

    In the RM UI, go to Advanced -> Events -> Clear Heartbeats. This will stop you from receiving heartbeat failures for the zenmodeler daemon(s) that are now stopped.

    Add a cronjob to schedule when ZenModeler runs.

    Add a cronjob for the zenoss user on each collector to schedule when you want the modeler to run. Modify a crontab in /etc to determine when you want it to run. For example, edit /etc/cron.daily/zenoss to add the job:

    # Remodel Zenoss devices on this collector.
    0 0 * * * bash -lc '/opt/zenoss/bin/*_zenmodeler run --daemon --now'

    Note: Use the "*_" string if you have multiple collectors. Each collector has its own command, such as collector1_zenmodeler or collector2_zenmodeler. If you have a single collector on the master, you can remove the "*_" string.

  • The CiscoUCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been updated to 3.0.2 from 3.0.1.

    Dependencies:

    Calculated Performance ZenPack

    CiscoMonitor Zenpack

    Dashboard ZenPack

    Dynamic Service View ZenPack

    Predictive Threshold ZenPack

    PythonCollector ZenPack

    ZenPackLib ZenPack

    Changes:

    Fix duplicates of fan module components. (ZPS-6047)

    Fix CIMC session leakage. (ZPS-6110)

    Fix "AttributeError: search" tracebacks during graph update. (ZPS-6194)

    Fix zenchkrels errors for service profiles. (ZPS-6338)

    Fix Authentication failure (Authorization Required 552). (ZPS-5921)

    Tested with Zenoss Cloud, Zenoss 6.4.1 and Service Impact 5.5.1.

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 2.1.1 of the EMC (ZenPacks.zenoss.EMC.base) ZenPack has been released.

    This release is compatible with Zenoss 6.2 - 6.4 and Zenoss Cloud, and has the following additional requirements.

    ZenPacks.zenoss.WBEM >= 2.0.1

    ZenPacks.zenoss.PythonCollector >= 1.5.2

    ZenPacks.zenoss.StorageBase >= 1.3.0

    ZenPacks.zenoss.ZenPackLib >= 2.0.4

    The following changes have been made since the previous release: 2.1.0

    Compatibility with WBEM ZenPack 3.0.0

    Updated Storage Pool graphs to reflect collected data and remove deprecated, unused charts (ZPS-3572)

    Updated Celerra System modeling to no longer break when processing Virtual Data Movers (ZPS-3844)

    Added documentation details for wbemStatsSampleInterval, which is defined on the EMC device (ZPS-3605)

    Corrected links for ZC instances which had invalid links from one component to another (ZPS-4400)

    Tested with Zenoss 6.3.2, Cloud and Service Impact 5.5

    Detailed information on this ZenPack is available in the ZenPack Catalog.

  • The Dell EMC Isilon ZenPack (ZenPacks.zenoss.EMC.Isilon) has been updated to 1.0.2.

    This release is compatible with Zenoss versions 6.3 - 6.4 and Zenoss Cloud.

    The following changes have been made since the previous release: 1.0.1

    Fix uncatalog errors when deleting node devices (ZPS-6246)

    Fix most uncorroborated edges and multiple providers for Impact (ZPS-6214)

    Correct zenbatchdump output (ZPS-2900)

    IpInterface method override output corrected (ZPS-6182)

    Reduce logging verbosity (ZPS-4975)

    This release has been tested with Zenoss 6.4.1, Zenoss Cloud with Service Impact 5.5.1

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.4.4 of the StorageBase (ZenPacks.zenoss.StorageBase) ZenPack has been released.

    This release is compatible with Zenoss versions 6.3 - 6.4 and Zenoss Cloud and has the following additional requirements:

    ZenPacks.zenoss.DynamicView

    The following changes have been made since the previous release: 1.4.3

    Do not autogenerate Storage reports. (ZPS-4174)

    Shows a spinner wheel when generating Storage reports. (ZPS-4304)

    This release has been tested with Zenoss 6.4.1, Zenoss Cloud and Service Impact 5.5.1

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Applies To

    Zenoss 5.1 and greater

    Summary

    It is important to have accurate and tested system backups because they can mitigate problems caused by software or hardware issues. One method to back up Control Center and its installed applications is to use serviced backup. Backups can be performed via the command line or through automated cron jobs (crontab).

    The serviced backup command saves a backup of the current system, including the state of all services and data, to a compressed archive file (.tgz). Because serviced backup leverages functionality provided by serviced snapshot, backups can be performed without shutting down Zenoss or the docker containers; they are only momentarily suspended to enable reading the data.

    NOTE: It is important that there is enough free space to receive and store backups because running low on available disk space will result in errors and impact system performance.

    About serviced backup

    The default directory that receives backups from serviced backup is /opt/serviced/var/backups.

    The serviced backup command includes various options to tailor the command for specific use:

    USAGE

    serviced backup [command options] [arguments...]

    DESCRIPTION

    serviced backup DIRPATH

    OPTIONS

    --exclude '--exclude option --exclude option'

    Subdirectory of the tenant volume to exclude from backup

    --help, -h

    Shows the help for an option

    Procedures

    Backups

    Backups can be initiated from the UI or the command line. Zenoss stores its backups by default in the directory /opt/serviced/var/backups.

    Best practices for backup include:

    Weekly backups for production environments.

    Increasing backup frequency to daily during initial deployment or for development environments.

    Backup Using Control Center

    To back up the system through the Control Center UI, navigate to Control Center > Backup/Restore and click the Create Backup button.

    If a backup is run through the UI, the path is controlled by an entry in the /etc/default/serviced file:

    # SERVICED_BACKUPS_PATH=/opt/serviced/var/backups

    Change this entry to specify a new target directory for the UI initiated backups.

    Backup Using the Command Line

    To back up your system through the command line:

    As root or a valid Control Center user, back up the system to the default directory /opt/serviced/var/backups with the following command:

    serviced backup /opt/serviced/var/backups

    A successful backup is indicated by the system displaying the new backup file name, for example:

    backup-2016-09-29-163201.tgz

    Automating Backup with CentOS/RHEL 7

    To automate system backup on CentOS/RHEL 7, crontab alone is not used. Instead, bash scripts are used to automate cron tasks such as fstrim and zenossdbpack. Zenoss is configured with various bash scripts for this purpose.

    For examples of how to create appropriate automation scripts, see the files in /etc/cron.daily/, /etc/cron.hourly/ and /etc/cron.weekly/ in the current installation.

    Note: If automation is used to perform backups, care must be exercised to ensure accumulation of old backups does not consume all disk space. Old backups must be curated to maintain available storage space.

    Example serviced-backup File

    As an example of a script using serviced backup, add a file (called serviced-backup in this example) to the appropriate directory with the following contents:

    #!/bin/sh

    # Execute a Control Center backup, writing the results to the backups directory

    ${SERVICED:=/opt/serviced/bin/serviced} backup /opt/serviced/var/backups
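    The earlier note about curating old backups can be sketched as a companion function. This is an illustration and not a Zenoss-provided script: the function name and the 30-day default are assumptions, and it relies on the find options -maxdepth and -delete, which GNU find on CentOS/RHEL provides.

```shell
# Delete serviced backup archives older than a given number of days and
# print the path of each file removed.
#   $1 = backup directory (defaults to /opt/serviced/var/backups)
#   $2 = retention in days (defaults to 30, an arbitrary example value)
prune_old_backups() {
    local backup_dir="${1:-/opt/serviced/var/backups}"
    local keep_days="${2:-30}"
    # Only regular files directly in the directory that match the
    # backup-*.tgz naming shown above are considered.
    find "$backup_dir" -maxdepth 1 -type f -name 'backup-*.tgz' \
        -mtime +"$keep_days" -print -delete
}

# Example (run before or after the nightly backup):
#   prune_old_backups /opt/serviced/var/backups 14
```

    Restricting the match to the backup-*.tgz pattern keeps the function from touching anything else stored in the same directory.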

  • For this example, we start by creating a new domain. To create one, click the 'Create' drop-down box and select Domain. Below we use the name 'CPUMemory Domain', as it is best practice to include the type of object after the name.

    Picture: Add New Domain Screen

    Another good practice is to keep your save locations different for each type of object {Domain, Ad Hoc View, Report}. Below, the 'Data Source' is 'Create with Domain Designer' which will take us to the next section. For the sample report we choose the following tables:

    dim_date

    dim_device

    dim_group

    hourly_load_1

    hourly_load_5

    hourly_load_15

    hourly_mem_avail_real

    Picture: Selected Tables

    The "dim_date" table was chosen because we will use the "date_day_of_week" measure to get the Nth day of the week and the "date_month" measure to get the Nth month of the year.

    The "dim_device" table adds the device name to the table and serves as the base for the left joins.

    The "dim_group" table adds the group name to the table.

    The remaining tables will be left joined to gather additional metrics. Another table will be chosen as an additional base for left joins. The reason for the multiple sets of left joins is to align the hour between the metric tables and display the hour of the collection time. We want to choose the table with the most values.

    Picture: Domain Joins

    Above is an example of the left joins. The "dim_device:device_key" field is used as much as possible to link the other tables to the device. The left join is used because we do not want to eliminate any rows we do not have values for. The second join set uses "dim_date:date_key" to line the metrics up with the correct day. Since this table does not have an hour field, we use a third table. The third base table, our hand-picked metric table ("table:fct_ts_gmt"), is used to align the metric collections by hour. Our goal is for the collection time to line up across the report for the different metrics we are using.

    On the Display section of the Domain creation, we want to select all the fields or the one join set item. It is recommended to rename fields that share the same name before finishing the report. In our example we update all the "fct_avg" fields with their corresponding table names, which allows easier identification in the ad hoc view creation.

    Picture: Updating Identification

    Update the ID and description of any fields you may want to use in the ad hoc view, as this will make completing the ad hoc view easier. When we are done, we click Submit.

    To create an Ad Hoc View, click on the top 'Create' and choose 'Ad Hoc View'. Next we want to choose our domain we created or an existing domain. Picture: Selecting the domain

    We are choosing "CPUMemory Domain" and selecting all the fields. After we have completed the setup dialogs we will be configuring the Ad Hoc View page. To create calculated measures, click on the drop-down box to the right of the title 'Measures' and choose 'Create Calculated Measure...'

    Picture: Creating a Calculated MeasureOne of the measures we want to create is the hour in which the data collection was performed. We can do this by using the formula: "hour("fct_ts_gmt")" and "fct_ts_gmt" which comes from the the base of our third join. You may also be able to use "fct_ts_gmt."Picture: Creating Hour of Day

    We want to put a time stamp in our report for the metric, which can be done with the "any_metric:fct_ts_gmt" and "hourly_mem_avail_real:fct_ts." To clean up the view right click on the column and update the name. We will also change the format by right clicking on the column, choosing 'Change Date Format', and selecting "Apr 4, 2019 5:12:46 PM."Picture: Changing Data Format for Hour - Picking the fieldPicture: Changing Data Format for Hour - Changing the format

    Next we will set up the column Day of Week by selecting the measure "dim_date:date_day_of_week". To set up the numeric value of the month we will use the measure "dim_date:date_month".

    Picture: Measure Day of Week and Month

    The remaining fields are device name, group name, and metrics. The following pictures display the corresponding fields.

    Picture: Adding the device name

    Picture: Add the device group

    Picture: Choosing metrics for Memory

    Create any filters you want to use by renaming the columns, and when you are done save the Ad Hoc View. To save, click the floppy disk icon, choose Save Ad Hoc View As ..., enter the name "CPUMemory Ad Hoc View", save to the AdHocView folder, and click Save.

  • Applies To

    Resource Manager 5.1.x

    Resource Manager 6.x

    Summary

    The Resource Manager event console is a powerful tool for sorting and filtering events from your infrastructure. However, filtering and sorting can only be performed against event fields that have been indexed. If you find it necessary to index a new field, the following steps will take you through the process.

    Procedure

    To begin, create a ZenPack to contain your indexed event details. Create the new ZenPack from the Control Center command line; the ZenPack name shown is for example purposes only. This step may take several minutes and returns to the command line when complete.

    serviced service run zope zenpack-manager create ZenPacks.companyName.NewEventAttributes

    Launch a new zope shell:

    serviced service shell zope

    Change to the zenoss user:

    su - zenoss

    Change to the zep directory in the new ZenPack:

    cd /opt/zenoss/ZenPacks/ZenPacks.companyName.NewEventAttributes/ZenPacks/companyName/NewEventAttributes/zep/

    Copy the zep.json.example to zep.json:

    cp zep.json.example zep.json

    Edit the zep.json file to include your indexed event details, like so:

    {
      "EventDetailItem": [
        {
          "key": "rig",
          "type": "1",
          "name": "Rig"
        },
        {
          "key": "host",
          "type": "1",
          "name": "Host"
        },
        {
          "key": "app",
          "type": "1",
          "name": "App"
        }
      ]
    }

    EventDetailItem is the required reserved keyword and is a list of dictionaries where each dictionary specifies a field. The key attribute must match the name of the event attribute; the name field is the user-friendly name used in the GUI. Help for the type field, including the other event detail types, is given in the sample zep.json.example file, where a type of 1 defines a string value.

    In the above example, an event transform creates evt.rig, evt.host and evt.app. The GUI Event Console will allow you to select Rig, Host and App as event columns to filter on. Note the capitalization difference; these are the "friendly" names defined in the "name" fields above.
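    Before building the egg, it can help to sanity-check the edited file. The following is a minimal sketch, not part of the ZenPack tooling: the function name check_zep_config is hypothetical, and it only checks the structure described above (an EventDetailItem list whose entries each carry key, type and name). It is written for Python 3 on a workstation rather than for the Python 2.7 interpreter inside the Zope container.

```python
import json

# Attributes used by every entry in the example zep.json above.
REQUIRED_ATTRIBUTES = ("key", "type", "name")


def check_zep_config(text):
    """Return the event detail keys found in a zep.json document.

    Raises ValueError if the document is not valid JSON or does not have
    the structure described in the article.
    """
    config = json.loads(text)  # json.JSONDecodeError is a ValueError
    if not isinstance(config, dict):
        raise ValueError("top level must be a JSON object")
    items = config.get("EventDetailItem")
    if not isinstance(items, list):
        raise ValueError("EventDetailItem must be a list of dictionaries")
    keys = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("entry %r is not a dictionary" % (item,))
        missing = [attr for attr in REQUIRED_ATTRIBUTES if attr not in item]
        if missing:
            raise ValueError("entry %r is missing: %s" % (item, ", ".join(missing)))
        keys.append(item["key"])
    return keys


SAMPLE = """
{
  "EventDetailItem": [
    {"key": "rig", "type": "1", "name": "Rig"},
    {"key": "host", "type": "1", "name": "Host"},
    {"key": "app", "type": "1", "name": "App"}
  ]
}
"""

print(check_zep_config(SAMPLE))
```

    The returned keys are the attribute names an event transform would set (evt.rig, evt.host, evt.app in this example), so the list is a quick way to compare the file against the transform.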

    Build your egg and move it to your home directory:

    cd $ZENHOME/ZenPacks/ZenPacks.companyName.NewEventAttributes

    python setup.py bdist_egg

    cd dist

    cp ZenPacks.companyName.NewEventAttributes* /mnt/pwd

    Drop back to the root user and copy the ZenPack to /mnt/pwd.

    cp /opt/zenoss/ZenPacks/ZenPacks.companyName.NewEventAttributes/dist/ZenPacks.companyName.NewEventAttributes-1.0.0-py2.7.egg /mnt/pwd/

    Exit the container and install the ZenPack:

    exit

    serviced service run zope zenpack-manager install ZenPacks.companyName.NewEventAttributes*

    Stop zeneventserver and rebuild the Lucene indices:

    serviced service stop zeneventserver

    TENANT_ID=$(serviced service status | grep Zenoss.resmgr | awk '{print $2}')

    cd /opt/serviced/var/volumes/$TENANT_ID/zeneventserver/index/

    rm -rf archive/ summary/

    serviced service start zeneventserver

    Restart Resource Manager:

    serviced service restart Zenoss.resmgr

    Log into the UI and add the new columns to the Event Console with the Configure > Adjust Columns button. Make sure you scroll down in the Columns to display list; the new fields will be at the bottom.

  • Resource Manager 6.4.1 is now Generally Available for customer download. For more information, see the Release Notes.

  • Version 3.0.0 of the WBEM ZenPack (ZenPacks.zenoss.WBEM) has been released.

    This release is compatible with Zenoss Cloud, 6.2 - 6.4, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector (any version)

    The following changes have been made since the previous release: 2.1.0.

    Updated documentation describing the usage of 'wbemStatsSampleInterval'

    Updates to pywbem library

    This release was tested with the following Zenoss versions

    Zenoss Resource Manager 6.2.1

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.3.1 of the OpenStack (Tenant View) ZenPack (ZenPacks.zenoss.OpenStack) has been released.

    This release is compatible with Zenoss Cloud and Zenoss 6.2 - 6.3, and has no further requirements.

    The following changes have been made since the previous release.

    Compatibility with Impact 5.5

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.1.0 of the Google Cloud Platform ZenPack (ZenPacks.zenoss.GoogleCloudPlatform) has been released.

    This release is compatible with Zenoss Cloud, Zenoss 6, and has no additional requirements.

    The following changes have been made since the previous release: 1.0.2.

    Add support for Dataflow Jobs. (ZPS-5801)

    Add support for Cloud Functions. (ZPS-5719)

    Support "reducer" in Stackdriver Monitoring datasource. (ZPS-5558)

    Support "filter" in Stackdriver Monitoring datasource. (ZPS-5559)

    Support "group by" in Stackdriver Monitoring datasource

    Add support for Instance Group Monitoring (ZPS-3501)

    Add support for Billing Monitoring with services and region analysis (ZPS-5807)

    Support for tags for all components (ZPS-5809)

    Support for Billing reports based on tags and labels (ZPS-6024)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.x and 6.4

    Zenoss Service Impact 5.5.1

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.3.2 of the Cisco UCS Central ZenPack (ZenPacks.zenoss.CiscoUCSCentral) has been released.

    This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements

    CiscoUCS ZenPack

    PythonCollector ZenPack

    The following changes have been made since the previous release: 1.3.1

    Fix import errors from CiscoUCS 3.0.0. (ZPS-5865)

    This version has been tested with Zenoss Cloud and Service Impact 5.5. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 3.0.12 of the OracleDatabase Monitor ZenPack (ZenPacks.zenoss.DatabaseMonitor) has been released.

    This release is compatible with Zenoss versions 4.2 - 6.1, and has no further requirements.

    The following changes have been made since the previous release: 3.0.10.

    Fix permissions issues for sysstat datasource (ZPS-2852)

    Fix scale issues in Redo Size (ZPS-2876)

    Security: Hide all Connection String info from process list (ZPS-2185)

    Ensure txojdbc kills java process after timeout (ZPS-2479)

    Change tablespace storage graphs to base 1024 (ZPS-2254)

    Set SGA and PGA graphs to base 1024 (ZPS-2604)

    Fix missing properties and relations in Analytics (ZPS-2597)

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Applies To

    Zenoss 4.x

    Summary

    Zenoss Resource Manager 4.x uses RabbitMQ for messaging between several application daemons. Resource Manager administrators may therefore wish to be familiar with listing RabbitMQ queues, displaying queue contents, clearing queues, and completing other activities related to troubleshooting RabbitMQ.

    Procedure

    The following tasks are helpful for displaying the RabbitMQ configuration and are run as the root user on the host system.

    To list Zenoss exchanges (where messages are published):

    # rabbitmqctl -p /zenoss list_exchanges

    To list Zenoss queues (where messages are consumed):

    # rabbitmqctl -p /zenoss list_queues
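    When many queues are listed, it can be convenient to filter the listing down to the queues that actually hold messages. The sketch below is illustrative only. It assumes that list_queues prints one queue per line as a name and a message count separated by a tab, surrounded by banner lines; verify this against the output of your RabbitMQ version. The function name backed_up_queues and the second queue name in the sample are hypothetical.

```python
def backed_up_queues(listing, threshold=0):
    """Return (queue, message_count) pairs above threshold, largest first.

    `listing` is the captured text output of
    `rabbitmqctl -p /zenoss list_queues`, assumed to be tab-separated.
    """
    queues = []
    for line in listing.splitlines():
        name, sep, count = line.rpartition("\t")
        if not sep or not count.strip().isdigit():
            continue  # skip banner lines and anything that is not "name<TAB>count"
        if int(count) > threshold:
            queues.append((name, int(count)))
    return sorted(queues, key=lambda pair: pair[1], reverse=True)


# Assumed sample output; the counts and the second queue name are made up.
SAMPLE = (
    "Listing queues ...\n"
    "zenoss.queues.zep.rawevents\t1520\n"
    "zenoss.queues.example\t0\n"
    "...done.\n"
)

for queue, count in backed_up_queues(SAMPLE):
    print("%s: %d messages" % (queue, count))
```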

    The following tasks are useful for troubleshooting queues and are run as the zenoss user.

    To show all messages in the specified queue (the zep raw events queue is used as an example):

    $ zenqdump zenoss.queues.zep.rawevents

    To empty a queue, run a command that displays and acknowledges all messages in the queue. The sample command below uses the zep raw events queue as an example. This is not the fastest way to empty a queue; a quicker method is described below:

    $ zenqdump -A zenoss.queues.zep.rawevents

    The following tasks are useful for troubleshooting queues and are run as the root user.

    To clear queues quickly, all queues may be reset by recreating the /zenoss vhost. After stopping Resource Manager, run the following:

    # rabbitmqctl delete_vhost /zenoss

    # rabbitmqctl add_vhost /zenoss

    # rabbitmqctl set_permissions -p /zenoss zenoss '.*' '.*' '.*'

    The following actions will restore RabbitMQ service after the host server is renamed. Renaming a server after initial creation can cause problems because RabbitMQ uses the server hostname in file and directory names. Additionally, changing the Zenoss server's hostname can cause RabbitMQ to stop working because RabbitMQ ties its NODENAME to the hostname when its configuration is created. As a result, the following commands should be run whenever the server hostname is changed, after stopping Zenoss:

    # service rabbitmq-server stop

    # mv /var/lib/rabbitmq /var/lib/rabbitmq.old

    # mkdir /var/lib/rabbitmq

    # chown rabbitmq:rabbitmq /var/lib/rabbitmq

    # service rabbitmq-server start

    # rabbitmqctl add_user zenoss zenoss

    # rabbitmqctl add_vhost /zenoss

    # rabbitmqctl set_permissions -p /zenoss zenoss '.*' '.*' '.*'

  • Problem Description

    Following the upgrade of Resource Manager to 6.3.2 (or on a fresh install), Resource Manager is unable to connect to Analytics due to the update of the Python urllib3 library from version 1.10.2 to version 1.22 (see the Resource Manager 6.3.2 release notes). This affects users using a self-signed certificate.

    Users may notice Analytics data not updating, or may receive an "Unable to talk to Analytics server" flare when navigating in Resource Manager to Reports Configuration. When attempting to submit new or amended configuration data from Resource Manager for Analytics, users may receive a "Failed to connect to <Analytics Internal URL>" error flare. The following example is what would be found in the zope event log:

    2019-02-05T09:33:06 ERROR zen.etl.router Failed to connect to https://zenoss.example.com:443: Request 'https://zenoss.example.com:443/etl/502bc7a8-8450-11e8-817a-0242ac110018' failed: CERTIFICATE_VERIFY_FAILED certificate verify failed (_ssl.c:579)

    Users are able to sign on to Analytics using Resource Manager authentication. This only affects the Analytics ETL process.

    How to verify urllib3 library is causing your issue

    Navigate to Resource Manager Configuration and confirm if you receive the error flare.

    Amend a value of the configuration settings and submit. Verify the other message and check the zope logs:

    for i in {0..5} ; do serviced service attach Zope/$i cat /opt/zenoss/log/event.log >> event.log ; done
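    Once the logs have been collected into event.log, the relevant entries can be picked out with a short script. This is a sketch under the assumption that the entries look like the zope event log example shown earlier in this article; the function name find_cert_failures is hypothetical.

```python
import re

# Matches entries shaped like the zope event log example in this article.
CERT_FAILURE = re.compile(
    r"^(\S+) ERROR zen\.etl\.router Failed to connect to (\S+): "
    r".*CERTIFICATE_VERIFY_FAILED"
)


def find_cert_failures(lines):
    """Return (timestamp, url) for each certificate verification failure."""
    failures = []
    for line in lines:
        match = CERT_FAILURE.search(line.strip())
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


# The log entry quoted in this article, used here as sample input.
SAMPLE_LINES = [
    "2019-02-05T09:33:06 ERROR zen.etl.router Failed to connect to "
    "https://zenoss.example.com:443: Request "
    "'https://zenoss.example.com:443/etl/502bc7a8-8450-11e8-817a-0242ac110018' "
    "failed: CERTIFICATE_VERIFY_FAILED certificate verify failed (_ssl.c:579)",
]

for timestamp, url in find_cert_failures(SAMPLE_LINES):
    print(timestamp, url)
```

    To run it against the collected file, pass an open file object instead of the sample list, for example find_cert_failures(open("event.log")).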

    Attach to a zope container and run the following, updating the URL to match your environment (the URL shown is an example from a lab system):

    python -c "import urllib2;z = urllib2.urlopen('https://zenoss.example.com/');x = z.read()"

    Following is a traceback which would confirm the issue:

    Traceback (most recent call last):

    File "<string>", line 1, in <module>

    File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen

    return opener.open(url, data, timeout)

    File "/usr/lib64/python2.7/urllib2.py", line 431, in open

    response = self._open(req, data)

    File "/usr/lib64/python2.7/urllib2.py", line 449, in _open

    '_open', req)

    File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain

    result = func(*args)

    File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open

    context=self._context, check_hostname=self._check_hostname)

    File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open

    raise URLError(err)

    urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)>

    If nothing is returned then you're not impacted by the urllib3 issue.

    Verify the Analytics certificate hasn't expired and that it has the correct common name

    In a zope container, run either/both of the following two commands to check the expiry date of the Analytics certificate (updating to your Analytics URL):

    curl -v https://zenoss.example.com:443

    openssl s_client -connect zenoss.example.com:443 2>/dev/null | openssl x509 -noout -dates

    The output would look like the following:

    [root@7e0c9c8dfaf0 /]# curl -v https://zenoss.example.com:443

    * About to connect() to zenoss.example.com port 443 (#0)

    * Trying 10.90.36.188...

    * Connected to zenoss.example.com (10.90.36.188) port 443 (#0)

    * Initializing NSS with certpath: sql:/etc/pki/nssdb

    * CAfile: /etc/pki/tls/certs/ca-bundle.crt

    CApath: none

    * Server certificate:

    * subject: [email protected],CN=zenoss.example.com,OU=SomeOrganizationalUnit,O=SomeOrganization,L=SomeCity,ST=SomeState,C=--

    * start date: Jul 26 14:59:47 2017 GMT

    * expire date: Jul 26 14:59:47 2018 GMT

    * common name: zenoss.example.com

    * issuer: [email protected],CN=zenoss.example.com,OU=SomeOrganizationalUnit,O=SomeOrganization,L=SomeCity,ST=SomeState,C=--

    * NSS error -8181 (SEC_ERROR_EXPIRED_CERTIFICATE)

    * Peer's Certificate has expired.

    * Closing connection 0

    curl: (60) Peer's Certificate has expired.

    More details here: http://curl.haxx.se/docs/sslcerts.html

    curl performs SSL certificate verification by default, using a "bundle"

    of Certificate Authority (CA) public keys (CA certs). If the default

    bundle file isn't adequate, you can specify an alternate file

    using the --cacert option.

    If this HTTPS server uses a certificate signed by a CA represented in

    the bundle, the certificate verification probably failed due to a

    problem with the certificate (it might be expired, or the name might

    not match the domain name in the URL).

    If you'd like to turn off curl's verification of the certificate, use

    the -k (or --insecure) option.

    [root@7e0c9c8dfaf0 /]# openssl s_client -connect zenoss.example.com:443 2>/dev/null | openssl x509 -noout -dates

    notBefore=Jul 26 14:59:47 2017 GMT

    notAfter=Jul 26 14:59:47 2018 GMT
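    The notBefore and notAfter lines can also be compared against the current date in a script. The following is a minimal sketch, assuming the date format shown in the output above and an English (C) locale for month names; the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_openssl_date(line):
    """Parse a 'notBefore=...' or 'notAfter=...' line into (label, datetime).

    The returned datetime is naive and expressed in GMT.
    """
    label, _, value = line.strip().partition("=")
    if label not in ("notBefore", "notAfter") or not value.endswith(" GMT"):
        raise ValueError("unexpected line: %r" % (line,))
    # Drop the trailing " GMT" and parse the remainder.
    return label, datetime.strptime(value[:-4], "%b %d %H:%M:%S %Y")


def is_expired(not_after_line, now=None):
    """Return True if the certificate's notAfter date is in the past."""
    label, expires = parse_openssl_date(not_after_line)
    if label != "notAfter":
        raise ValueError("expected a notAfter line")
    if now is None:
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    return now > expires


# The value from the example output above, checked against a fixed date.
print(is_expired("notAfter=Jul 26 14:59:47 2018 GMT", now=datetime(2019, 2, 5)))
```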

    Create a new Analytics certificate

    On the Analytics host, create some new working directories:

    mkdir /tmp/certUpdate

    mkdir /tmp/certUpdate/backup

    Create the new certificate (this example sets the expiry date for 10 years from now):

    openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout /tmp/certUpdate/localhost.key -out /tmp/certUpdate/localhost.crt

    Back up the old certificates:

    mv /etc/pki/tls/certs/localhost.crt /tmp/certUpdate/backup/localhost.crt

    mv /etc/pki/tls/private/localhost.key /tmp/certUpdate/backup/localhost.key

    Copy the new certificates over:

    cp /tmp/certUpdate/localhost.key /etc/pki/tls/private/localhost.key

    cp /tmp/certUpdate/localhost.crt /etc/pki/tls/certs/localhost.crt

    Restart httpd and the zenoss_analytics service:

    systemctl restart httpd

    service zenoss_analytics stop

    service zenoss_analytics start

    Rerun the checks to verify the expiry date has now been updated

    Update the certificates for Resource Manager

    Launch a zope shell:

    serviced service shell -i -s fixAnalyticsCert zope bash

    Define your URL (Replace 'zenoss.example.com' with your correct Analytics URL)

    ANALYTICS=zenoss.example.com

    Run the check to verify the certificate is still bad:

    python -c "import urllib2;z = urllib2.urlopen('https://"${ANALYTICS}"');x = z.read()"

    Navigate to the CA directory and create the certificate pem file:

    cd /etc/pki/ca-trust/source/anchors/

    openssl s_client -showcerts -connect ${ANALYTICS}:443 </dev/null 2>/dev/null|openssl x509 -outform PEM > analyticsCertfile.pem

    Update the certificate store (do not edit the store by hand; note that the command produces no output):

    update-ca-trust extract

    Rerun the check to verify the certificate is okay now:

    python -c "import urllib2;z = urllib2.urlopen('https://"${ANALYTICS}"');x = z.read()"

    The above should not return anything.

    Exit the container and commit the snapshot:

    serviced snapshot commit fixAnalyticsCert

    Delete the resulting snapshot tag:

    serviced snapshot rm <tag>

    Restart Resource Manager services:

    serviced service restart Zenoss.resmgr

    Issue faced with Resource Manager 6.3.2 and Analytics v5.0.6 & v5.1.0

    Related articles

    https://access.redhat.com/articles/2039753#controlling-and-troubleshooting-certificate-verification-6

  • The CiscoUCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been updated to 3.0.1 from 3.0.0.

    Dependencies:

    Calculated Performance ZenPack

    CiscoMonitor ZenPack

    Dashboard ZenPack

    Dynamic Service View ZenPack

    Predictive Threshold ZenPack

    PythonCollector ZenPack

    ZenPackLib ZenPack

    Changes:

    3.0.1

    Add impact relationships for Aggregation Pools. (ZPS-5836)

    Fixes Domain Overview Portlet. (ZPS-5839)

    Correspond UCSProtocol timeout value to the collection interval. (ZPS-5881)

    Fixes event transform for UCS Fault. (ZPS-5941)

    Fixes incorrect events for PSU's. (ZPS-5939)

    Tested with Zenoss Cloud, Zenoss 6.4.0 and Service Impact 5.5.1

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • The HP Proliant ZenPack (ZenPacks.zenoss.HP.Proliant) has been updated to 3.3.3. This maintenance release contains several fixes for various issues.

    Dependencies:

    Zenoss >= 4.2

    PythonCollector ZenPack

    WBEM ZenPack

    Changes:

    Updated icons with dark-theme compatible versions

    ILO modeler timeout should follow zCollectorClientTimeout

    Fix exceptions in impact relationship providers

    Overhaul impact relationships

    Hide credentials from logging (ZPS-4338)

    Add support for ILO Gen 7 physical drives (ZPS-5274)

    Fix WBEM component metric duplication issue (ZPS-6132)

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Control Center 1.6.5

    Control Center 1.6.5 is now Generally Available for customer download. For more information, see the release notes.

  • Zenoss Service Impact 5.5.1

    Zenoss Service Impact 5.5.1 is now Generally Available for customer download. For more information, see the Release Notes.

  • Zenoss Resource Manager 6.4.0

    Resource Manager 6.4.0 is now Generally Available for customer download. For more information, see the Release Notes.

  • Version 1.0.2 of the Google Cloud Platform ZenPack (ZenPacks.zenoss.GoogleCloudPlatform) has been released.

    This release is compatible with Zenoss Cloud, Zenoss 6, and has no additional requirements.

    The following changes have been made since the previous release: 1.0.1.

    Fixing compatibility with DynamicView 1.8.0 and Impact 5.5. (ZPS-5666)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.2

    Zenoss Service Impact 5.5.1

  • Version 1.11.0 of the PythonCollector ZenPack (ZenPacks.zenoss.PythonCollector) has been released.

    This release is compatible with Zenoss Cloud, Zenoss 6, and has no additional requirements.

    The following changes have been made since the previous release: 1.10.1.

    Support extra datapoint tags in Zenoss Cloud. (ZPS-4587)

    Support publishing ad hoc metrics from plugins. (ZPS-4629)

    Provide better errors to collector when datamap application fails. (ZPS-5833)

    Avoid disabling plugins due to clock jumps by using monotonic clock.

    Allow testing of datasources with readable output. (ZEN-31038)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.2

  • Version 2.9.4 of the Microsoft Windows ZenPack (ZenPacks.zenoss.Microsoft.Windows) has been released.

    This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector 1.4 or higher

    ZenPacks.zenoss.ZenPackLib 2.0.5 or higher

    The following changes/fixes have been made since the previous release: 2.9.3

    Fix Traceback seen for zenoss.winrm.Processes modeler plugin (ZPS-5676)

    Fix Windows ZP: After upgrade to 2.9.3, seeing "caught returnValue being used in a non-generator" errors (ZPS-5727)

    Fix Windows - cluster disks may not be modeled or monitored on 2008 cluster when using Powershell v2.0 (ZPS-5755)

    Fix Unexpected Response --Encrypted Boundary during windows collection (ZPS-5767)

    Document wincommand notification can be deprecated (ZPS-5501)

    Fix Receiving 'NoneType' object has no attribute '__getitem__' when modeling Windows device (ZPS-5770)

    Fix Perfmon command(s) did not start: 'NoneType' object has no attribute 'getConnection' (ZPS-5822)

    Fix "The referenced context has expired" errors (ZPS-5797)

    Fix WinRM device modeler fails to model system OS info if Win32_ComputerSystem doesn't return data (ZPS-5820)

    Fix Windows collection attempts to use a dead connection causing a timeout (ZPS-5819)

    Fix Cluster Monitoring Doesn't Account for Different Service Names (ZPS-5835)

    Fix 'Get-Disk' is not recognized on Windows 2008 Clusters (ZPS-5822)

    This release has been tested with Zenoss Cloud, Zenoss Resource Manager 6.3.2 and Service Impact 5.5.0. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 2.5.2 of the Calculated Performance ZenPack (ZenPacks.zenoss.CalculatedPerformance) has been released.

    This release is compatible with Zenoss Cloud, 6.3 - 5.3, 4.2.5, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector >= 1.9

    The following changes have been made since the previous release: 2.5.1.

    Remove unnecessary impact relationships for aggregation pools. (ZPS-5834)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.2

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.4.3 of the Layer2 ZenPack (ZenPacks.zenoss.Layer2) has been released.

    This release is compatible with Zenoss Cloud, 6.3 - 5.3, 4.2.5, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector >= 1.1

    The following changes have been made since the previous release: 1.4.2

    Improve performance of calculating network impact relationships. (ZPS-5712)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.2

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 4.0.2 of the VMware vSphere ZenPack (ZenPacks.zenoss.vSphere) has been released.

    This release is compatible with Zenoss versions 5.3 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    PropertyMonitor ZenPack 1.1.0 or later.

    Enterprise Reports ZenPack (any version)

    ZenPackLib ZenPack 2.1.0 or higher

    The following changes have been made since the previous release: 4.0.1.

    Fix invalid threshold for memory capacity on vSphere hosts (ZPS-4979)

    Changing zVSphereModelVSAN property also switch monitoring templates (ZPS-5216)

    Add support for NSX OpaqueNetworks (ZPS-5076)

    Fix "vmodl.fault.MethodNotFound" error from vSAN monitoring against vCenter v6.0 (ZPS-5545)

    Handle situation where VMware Tools return a guest IP address with spaces in it (ZPS-5536)

    Fix sporadic "'NoneType' is not iterable" errors caused by model-monitored datapoints whose values are not updated frequently. (ZPS-5067)

    Migrate Property datasources to vSphere Modeled in local copies of monitoring templates. (ZPS-5628)

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 2.3.3 of the Linux Monitor ZenPack (ZenPacks.zenoss.LinuxMonitor) has been released.

    This release is compatible with Zenoss Cloud, 6.3 - 5.3, 4.2.5, and has the following additional requirements.

    ZenPacks.zenoss.ZenPackLib (any version)

    The following changes have been made since the previous release: 2.3.2.

    Fix and optimize various impact relationship calculations. (ZPS-5664, ZPS-5711, ZPS-5792, ZPS-5806)

    Fix "NotFound" modeling exception for snapshots of thin pools. (ZPS-5816)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.3.2

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.1.3 of the IBM Power ZenPack (ZenPacks.zenoss.IBM.Power) has been released.

    This release is compatible with Zenoss Cloud, 6.3 - 5.3, 4.2.5, and the following changes have been made since the previous release: 1.1.2.

    Fix inconsistencies in Impact relationships

    Do not attempt to set invalid LPAR relations (ZPS-5873)

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Applies to

    Zenoss 5.0

    Summary

    Docker uses the default 172.17.0.0/16 subnet for container networking. If this subnet is not available for Docker in your environment (for example, because your network already uses this subnet), you must configure Docker to use a different subnet. You can perform this process across all the hosts in your system, or only on hosts deployed into environments where 172.17.0.0/16 is unavailable. In a multihost deployment, there is no requirement that all hosts use the same subnet for Docker container communications.

    Procedure

    Stop the Resource Manager services running on the host (for example, the entire Resource Manager application if this procedure is being completed on a master server).

    Shut down serviced and Docker on the host by typing the following on the host command line:

    $ systemctl stop serviced

    $ systemctl stop docker

    Remove the existing MASQUERADE rules from the POSTROUTING chain in iptables:

    iptables -t nat -F POSTROUTING

    Remove the existing IP address from the Docker bridge device:

    $ ip link set dev docker0 down

    $ ip addr del 172.17.42.1/16 dev docker0

    Pick a subnet you won't need to route to/from your collector. A /24 should be appropriate unless you need more than roughly 250 containers on a given host (a /24 provides 254 usable addresses, one of which is assigned to docker0). The following example uses 192.168.5.0/24:

    $ ip addr add 192.168.5.1/24 dev docker0

    $ ip link set dev docker0 up

    Verify that the interface has the correct IP set:

    $ ip addr show docker0

    You should see a result similar to the following (the 'state DOWN' is expected at this stage):

    docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN

    link/ether 56:84:7a:fe:97:99 brd ff:ff:ff:ff:ff:ff

    inet 192.168.5.1/24 scope global docker0

    valid_lft forever preferred_lft forever

    Start Docker:

    $ systemctl start docker

    Verify that the MASQUERADE rule for your new subnet has been added to the POSTROUTING chain:

    $ iptables -t nat -L -n

    As part of the response, you should expect to see the following for your Docker subnet:

    Chain POSTROUTING (policy ACCEPT)

    target prot opt source destination

    MASQUERADE all -- 192.168.5.0/24 0.0.0.0/0

    If you see those expected results, start serviced:

    $ systemctl start serviced
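    The subnet chosen in the procedure above can be sanity-checked before it is applied. The following sketch uses Python's standard ipaddress module; the function name describe_bridge_subnet and the list of networks already in use are hypothetical inputs, and the choice of the first usable address for docker0 simply mirrors the 192.168.5.1/24 example. It assumes a prefix of /30 or shorter.

```python
import ipaddress


def describe_bridge_subnet(cidr, in_use=()):
    """Summarize a candidate Docker bridge subnet.

    `in_use` lists networks already present in the environment; any that
    overlap the candidate are reported as conflicts.
    """
    network = ipaddress.ip_network(cidr)
    conflicts = [
        other for other in in_use
        if network.overlaps(ipaddress.ip_network(other))
    ]
    usable = network.num_addresses - 2  # exclude network and broadcast addresses
    return {
        # First usable address, as in the 192.168.5.1/24 example.
        "bridge_address": "%s/%d" % (network.network_address + 1, network.prefixlen),
        # One usable address is taken by docker0 itself.
        "container_addresses": usable - 1,
        "conflicts": conflicts,
    }


print(describe_bridge_subnet("192.168.5.0/24", in_use=["172.17.0.0/16"]))
```

    An empty conflicts list means the candidate does not overlap any of the networks supplied; it says nothing about networks that were not listed.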

  • Applies To

    Zenoss Resource Manager 5.x

    Summary

    It is often useful to run scripts or troubleshooting tools in docker containers. This KB describes the methods necessary to modify Resource Manager (RM) Docker containers to run scripts or tools and enable them to persist.

    Because Docker containers are initiated from a base image, they are stateless. This means any modifications made to the container are lost upon restart unless those changes are incorporated into the image. Changes include adding or installing scripts and tools within the container.

    The method used to modify the container to make the modifications available to the Zenoss RM depends on how permanent the changes need to be. The choices for permanency include:

    Temporary - the script or tool exists only until the container restarts

    Semi-permanent - the script or tool is replicated to all other containers and persists on container restart but is lost at next upgrade

    Permanent - the script or tool persists following upgrade (Note: although these will be available to any host, there are exceptions depending on use case)

    For additional information on containers, see the KB article titled Virtualization and Docker Containerization for Poets.

    Procedures

    The following procedures explain how to copy tools or scripts into a container and make them persist, depending on the permanency requirements.

    Note: The use case described in this KB suggests the zope container because toolbox requires access to zodb. The actual target will vary based on your use case. Contact Zenoss support if you are unsure which service needs access to the custom tool.

    Temporary Tool or Script

    This procedure makes the tool or script available only temporarily; it will not persist when the container restarts or following an upgrade.

    To make the tool or script persist until an upgrade (semi-permanent), proceed instead to the next section, Semi-Permanent Tool or Script.

    To make the tool or script persist following an upgrade, proceed instead to the section Permanent Tool or Script.

    Move the tool or script into the container from a server running SSH:

    Become root if necessary:

    sudo su -

    Attach to the container, for example:

    serviced service attach zope/0

    Become the zenoss user:

    su - zenoss

    SCP the file into the container. For example toolbox.zip:

    scp user@remote_server:~/toolbox.zip .

    NOTE: It is also possible to use rsync to copy the files to the host.

    Execute the script or install the tool. NOTE: Some tools might require installation. For example to install the toolbox utility:

    Change into the directory where the toolbox utility is located.

    Install the toolbox utility:

    easy_install zenoss_toolbox.zip

    Semi-Permanent Tool or Script

    The following procedure makes the tool, script or utility semi-permanent in the container(s). This means it is replicated to all other containers and persists upon container restart but is lost at the next upgrade.

    To make the tool or script persist following an upgrade, proceed instead to the next section, Permanent Tool or Script.

    Shell into a new container and move the tool or script into the container from a server running SSH:

    Become root if necessary:

    sudo su -

    Shell into a new container to create it based on an existing container. For example, spawn the new container from the existing zope container:

    mynew=InstallToolbox

    serviced service shell -s $mynew -i zope

    Become the zenoss user:

    su - zenoss

    SCP the file into the container. For example, toolbox.zip:

    scp user@remote_server:~/toolbox.zip .

    NOTES:

    It is also possible to use rsync to copy the files to the host.

    When a new container is spawned, Docker makes the files in the directory from which it was spawned available inside the container under /mnt/pwd.

    Execute the script or install the tool. NOTE: Some tools might require installation. For example, to install the toolbox utility:

    Change into the directory where the toolbox utility is located.

    Install the toolbox utility:

    easy_install toolbox.zip

    Exit the zenoss user:

    exit

    Exit the container:

    exit

    Commit the container where the changes have been made, for example commit the mynew container:

    serviced snapshot commit $mynew

    Restart the service so the change becomes effective in all containers, for example, zope:

    serviced service restart zope

    Permanent Tool or Script

    The following procedure makes tools or scripts permanent in the container(s) by placing them within a special DFS directory. This special directory mounts to the containers and survives upgrade. This means any tool or script installed in this directory is replicated to all other containers, persists upon container restart, and survives upgrade. Note: This applies only to single-host instances of 5.x.

    Shell into a new container:

    Become root if necessary:

    sudo su -

    Shell into a new container to create it based on an existing container. For example, spawn the new container from the existing zope container:

    mynew=InstallToolbox

    serviced service shell -s $mynew -i zope

    Become the zenoss user:

    su - zenoss

    Create a directory within the DFS directory /opt/zenoss/var/ext for your tool or script. For example:

    mkdir /opt/zenoss/var/ext/my_new_dir

    Copy or install your tool or script into the new directory. For example:

    Change into the new directory:

    cd /opt/zenoss/var/ext/my_new_dir

    Copy or install the tool or script into that directory. For example:

    scp user@remote_server:~/toolbox.zip .

    easy_install toolbox.zip

    Exit the zenoss user:

    exit

    Exit the container:

    exit

    Commit the container where the changes have been made, for example commit the mynew container:

    serviced snapshot commit $mynew

    Restart the service so the change becomes effective in all containers, for example, zope:

    serviced service restart zope

    Adding the tool or script to the special DFS directory ensures it will persist following an upgrade.

    After an Upgrade

    Following an upgrade, it is necessary to:

    Reapply patches (via quilt for example)

    Retrieve any stored customized scripts from /opt/zenoss/var/ext/my_new_dir

    Reinstall any previously installed tools from /opt/zenoss/var/ext/my_new_dir

  • Version 2.3.2 of the Linux Monitor ZenPack (ZenPacks.zenoss.LinuxMonitor) has been released.

    This release is compatible with Zenoss Cloud, Zenoss 5.3 - 6.2, and Zenoss 4.2.5, and has the following additional requirements:

    ZenPacks.zenoss.ZenPackLib (any version)

    The following changes have been made since the previous release: 2.3.1.

    Guard against out of date sudoers configuration in service monitoring. (ZPS-4334)

    Allow filesystem modeling and monitoring to work with or without sudo access. (ZPS-4340)

    Fix LVM monitoring when */sbin not in user's path. (ZPS-4349)

    Fix undocumented sudo usage of "systemctl status". (ZPS-4121)

    Update reduced recommended sudoers configuration. (ZPS-4121)

    This release was tested with the following Zenoss versions.

    Zenoss Cloud

    Zenoss Resource Manager 6.2.1

    Zenoss Resource Manager 5.3.3

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Applies To

    Zenoss Resource Manager 5.x

    Summary

    This KB is designed to be a quick primer on datasource creation. It uses the case of monitoring an SSL certificate as the example.

    Procedure

    The following procedure describes how to create a new data source:

    From within the RM GUI, make a local template. Click the gear icon on the bottom left to launch the new template dialog.

    Give the new template any name you like.

    Create a new datasource. Click the plus (+) icon at the top and select command from the list. In this example it is named ssl. The Edit Data Source dialog displays.

    Enter the following into the Command Template field:

    /bin/bash -c "echo 'OK|days='$$(expr \( $$(echo | openssl s_client -servername ${device/manageIp} -connect ${device/manageIp}:443 2>/dev/null | openssl x509 -noout -dates | grep -v notBefore | date --date="$$(sed 's/notAfter=\(.*\).*/\1/')" +'%s') - $$(date +%s) \) / 60 / 60 / 24 )"


    Complete any fields that you require.

    Click SAVE to save and close the dialog.

    Command Notes

    Dollar signs ($) in Commands

    Anything that would normally use a single dollar sign ($) in a shell command, such as the first command substitution $(expr ...), requires two dollar signs ($$). This is because a single dollar sign introduces an expression in the "TALES" engine. For more information about TALES, consult the Appendix of the Administration Guide. An actual TALES variable, ${device/manageIp}, appears in the example command above with a single dollar sign and inserts the monitored device's IP address. Every other dollar sign must be doubled so that TALES does not attempt to interpret the text that follows it.
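
    The doubling rule can be applied mechanically. The sketch below is a hypothetical helper (tales_escape is not part of Zenoss) that doubles every dollar sign that is not immediately followed by an opening brace. It assumes that every ${...} in the input is a TALES expression; a shell variable written as ${VAR} would be left alone and must be doubled by hand.

```shell
# Hypothetical helper, not a Zenoss tool: prepare a shell command for the
# Command Template field by doubling "$" wherever it is not followed by "{".
# Assumption: every "${...}" in the input is a TALES expression.
tales_escape() {
    sed 's/\$\([^{]\)/$$\1/g'
}

# Example: the command substitution is escaped, the TALES variable is not.
printf '%s\n' 'echo $(date +%s) ${device/manageIp}' | tales_escape
```

    Review the result before pasting it into the Command Template field, because the helper cannot tell a shell ${VAR} from a TALES expression.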

    Use of /bin/bash Before Command

    The command is prefixed with /bin/bash to initialize the path and other system variables for the command daemon. Because the daemon does not ordinarily run commands through a shell, a bare command such as echo produces an error: echo is found through the path of the target operating system, which has not been initialized. After initializing with /bin/bash, it is no longer necessary to use full paths for commands such as echo, expr, and sed.

    SSH Checkbox

    The Use SSH checkbox toggles between running this command on the monitoring server and running it on the target server. In this case it is left unchecked because the target (google.com in this example) is accessible from the monitoring server.

    If it is necessary to run a command that only works on the target server (for example, to verify that a specific networking path is open), tick the checkbox so that the command runs on the target server.

    Pad the Data

    In the example, the data is padded with Nagios format. If the entire command is run in a shell, the result is something like:

    OK|days=1234567890

    You cannot use a command that only provides a number such as 1234567890 because the system does not know which datapoint the number belongs to; a single command can return multiple datapoints. You must label each value in the command's output so that it can be matched to the name of the datapoint you create in RM. In the example, the label days matches a datapoint named days.
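
    To see which labels a command produces, its output can be split on the pipe character. The sketch below only illustrates the label=value layout; it is not the parser Zenoss uses, and list_datapoints is a hypothetical name.

```shell
# Illustration only: print each label=value pair that follows the "|".
list_datapoints() {
    sed 's/^[^|]*|//' | tr ' ' '\n' | grep '='
}

echo 'OK|days=42 present=1' | list_datapoints
```

    Each label printed this way needs a datapoint of the same name under the datasource.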

    Set Cycle Time

    There is no need to keep the cycle time at 300 seconds (5 minutes). Because this value changes only once a day, a longer cycle might work better, such as 86400, the number of seconds in a day.

    Use Scripts

    Ideally, scripts should be used that provide data even when there is no data to provide.

    For example, if SSL is not installed on the target device, the associated variables can be set to 9999.

    As a second example, to add a datapoint called present:

    Create a new datapoint called present under the datasource within RM.

    Have the command return something like OK|days=1234567890 present=1.

    It is also possible to add logic, for example, "if this server is running nginx, check for an SSL certificate, then populate the data accordingly".

    The commands depend on your particular environment. You are free to run anything that works within the target environment and limited only by your imagination.
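
    The fallback idea can be sketched as a small shell function. The name emit_cert_datapoints and the present=0 convention are assumptions made for illustration; the placeholder 9999 comes from the text above. In a standalone script the dollar signs are not doubled, because the script body does not pass through TALES.

```shell
# Hypothetical wrapper: always emit both datapoints, even when there is no data.
# $1 is the number of days until the certificate expires, or empty when no
# certificate could be read from the target.
emit_cert_datapoints() {
    if [ -n "$1" ]; then
        echo "OK|days=$1 present=1"
    else
        # Placeholder value from the text above; present=0 records the absence.
        echo "OK|days=9999 present=0"
    fi
}

emit_cert_datapoints 42
emit_cert_datapoints ""
```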

  • Applies To

    All Versions

    Summary

    Administrators must choose an approach for organizing their devices in Zenoss and remain consistent in implementing that approach as devices are added. Zenoss offers the following device organizers:

    Device classes - Provided out-of-the-box with Zenoss. Enables administrators to organize devices by device type. Additional classes can be created at any time. For example, Zenoss includes a device class called /Devices/Server/Linux with the intention that administrators will place Linux servers in this class. Administrators may choose to add sub-classes to the /Devices/Server/Linux class. However, placing devices in a particular device class has greater implications for Zenoss monitoring than merely grouping them logically. This KB examines these implications in greater detail and makes recommendations about when new classes should and should not be created.

    Groups - Device organizers can be created by administrators as needed.

    Systems - Device organizers can be created by administrators as needed.

    Locations - Device organizers can be created by administrators as needed.

    Organizing Devices: A Short Case Study

    To fully examine the different approaches administrators can take when organizing devices, consider a fictitious example: Hypothetical Insurance Company. Hypothetical has offices in several cities and began monitoring their devices using Zenoss in Annapolis, Maryland and Washington, DC. The company also has several internal departments, and wants to administer the IT resources of these departments as distinct collections. Their most important departments include Underwriting, Business Development, and Accounting. Hypothetical Insurance could choose from two broad approaches for organizing these devices:

    Method #1 (Recommended): Organize Devices by Creating Groups, Systems and Locations Organizers

    Hypothetical should make use of the Groups, Systems and Locations organizers provided specifically for the purpose of logically grouping devices. Hypothetical can create their own organizers as required and specify which devices are associated with each. For example, they can create:

    - Washington and Annapolis Locations organizers

    - Underwriting, Business Development, and Accounting Groups organizers

    They can then move systems into these organizers as appropriate. Although it appears that devices have physically moved when they are dragged and dropped to these organizers within the user interface, the devices aren't actually moved within the Zenoss database. Rather, the Zenoss software sets "Groups," "Systems" and "Locations" organizers as attributes of devices. As described below, this distinction has important implications for Zenoss' operation "under the hood".

    Method #2 (Not Recommended): Create Device Classes to Organize Devices

    Alternatively, Hypothetical could add child classes to each device class to organize their devices. For example, they could create the following child classes of /Devices/Server/Linux/ to organize their Linux servers:

    /Devices/Server/Linux/Annapolis

    /Devices/Server/Linux/Washington

    These child organizers could be further expanded to separate systems in different departments:

    /Devices/Server/Linux/Annapolis/Underwriting

    /Devices/Server/Linux/Washington/Business_Development

    These steps could be repeated for each device class Hypothetical uses.

    Creating device classes to organize devices is not recommended for two primary reasons:

    1) Creating device classes to organize devices can make it very difficult for administrators to find the devices they are looking for. For example, an administrator who needed to find all of the servers (of any type) in the Underwriting department would need to hunt through all of the child classes under /Server, identifying and writing down the names of the relevant devices. The broader the administrator's query, the more severe this problem becomes.

    2) Creating classes to logically group devices unnecessarily increases the complexity of the Zenoss back-end database. This increased complexity can contribute to performance degradation. Unlike "Groups" or "Systems" organizers, which are attributes defined for devices, a new device class constitutes a new location within the database with its own set of attributes and relationships with other database objects.

    When practical, avoid the increased database complexity associated with new classes unless they are necessary for custom monitoring of a subset of devices. Because Hypothetical is creating organizers only for the purpose of logically organizing their devices, with no changes in the monitoring or modeling specifics for subsets of devices, they should choose Method #1 above and make use of "Groups," "Systems" and "Locations" organizers.

    When Hypothetical Should Create New Device Classes

    New device classes should be created when one or more of the following need to be unique for a subset of devices:

    monitoring template bindings

    modeler plugin associations

    configuration properties

    Consider a concrete example at Hypothetical Insurance. For a subset of Windows servers, system administrators at Hypothetical have decided to fetch server logs related to print spooler failures so they appear in the Zenoss event console. Their method for doing this is to add an additional data source to the Windows server template, but in order to prevent their event console from being clogged up with entries from servers they are unconcerned with, they only want to apply the modified template to a subset of their Windows servers. This scenario would be a perfect use case for the creation of a new sub class of the /Server/Microsoft/Windows device class. Here are the steps Hypothetical administrators would take to create the new class:

    Create the new Windows sub-class. For example: /Server/Microsoft/Windows/Print_Logging/

    Create a copy of the standard Windows monitoring template for the new class ("copy/override" the template)

    Add the additional data source to the new template

    Move the target servers into the new class

    Note that no changes will be made to the target servers' "Groups," "Systems" or "Locations" organizer settings by moving the target servers to the new sub class. When and if the need to fetch the print spooler events from one or more of the servers ends, they can be moved back to their original device class.

  • Applies To

    Zenoss 5.x

    Summary

    Hardware failure in Control Center can take various forms, including:

    Running out of disk space on one or more of the partitions that store Control Center, Docker or Zenoss data.

    Power failure on a Control Center host.

    In either case, data might not have been written to disk, leaving your system in an unusable state.

    Symptoms

    The symptoms of low disk space or power failure include system instability, data loss, and log file entries.

    Procedures

    The following sections describe some possible hardware failure results and their associated remediation steps.

    How to Check Disk Space

    The normal du/df commands do not provide useful information when applied to logical volumes managed by the Logical Volume Manager (LVM). This is because there are volume groups that consist of physical volumes that are configured into logical volumes. Use the following LVM-specific commands to get more information.

    Identify Volume Groups

    To determine which volume groups exist, issue the following command:

    # vgs

    Display Volume Group Information

    To determine the volume group total space, how much free space exists within the volume group and how much space is allocated to logical volumes, issue the following command:

    # vgdisplay [vg_name]

    Identify Physical Volumes

    To determine which physical volumes make up the volume group and which logical volumes exist within the volume group, include -v:

    # vgdisplay -v [vg_name]

    Display Logical Volume Sizes

    To display the size of logical volumes:

    # lvs [-a] [--units hHbBsSkKmMgGtTpPeE]

    Note: The lvs command can output in units you choose. From the lvs manpage:

    --units hHbBsSkKmMgGtTpPeE

    All sizes are output in these units: (h)uman-readable, (b)ytes,

    (s)ectors, (k)ilobytes, (m)egabytes, (g)igabytes, (t)erabytes,

    (p)etabytes, (e)xabytes. Capitalise to use multiples of 1000 (S.I.)

    instead of 1024. Can also specify custom units e.g. --units 3M

    For additional information on the Logical Volume Manager, see the Red Hat Enterprise Logical Volume Manager Administration, LVM Administrator Guide located at: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/index.html.
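
    For a quick overview, the vgs output can be reduced to a percentage of free space per volume group. The sketch below assumes input produced by vgs --noheadings --nosuffix --units g -o vg_name,vg_size,vg_free; vg_free_report is a hypothetical helper name and the sample values are made up.

```shell
# Hypothetical helper: read "<vg_name> <vg_size> <vg_free>" lines on stdin
# and print the percentage of each volume group that is still unallocated.
vg_free_report() {
    awk 'NF == 3 && $2 > 0 { printf "%s %.0f%% free\n", $1, ($3 / $2) * 100 }'
}

# On a real host (as root):
#   vgs --noheadings --nosuffix --units g -o vg_name,vg_size,vg_free | vg_free_report
# Made-up sample input for illustration:
printf '  serviced 100.00 25.00\n  docker 50.00 5.00\n' | vg_free_report
```

    A volume group with little or no free space cannot extend its logical volumes, so a low percentage here is worth investigating before a partition fills up.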

    Recovery Steps For Various Scenarios

    The Docker Filesystem (/var/lib/docker) Out Of Space

    If /var/lib/docker has no available disk space, delete existing containers to free up space, for example:

    Shut down serviced.

    Delete all existing containers:

    docker rm $(docker ps -qa)
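
    When no containers exist, docker ps -qa prints nothing and docker rm is then called without arguments, which produces a usage error. A guarded variant, sketched here under the hypothetical name remove_all_containers, skips the removal in that case:

```shell
# Hypothetical wrapper around the command above: remove all containers,
# but do nothing when the list of container IDs is empty.
remove_all_containers() {
    ids=$(docker ps -qa)
    if [ -z "$ids" ]; then
        echo "no containers to remove"
        return 0
    fi
    # Unquoted on purpose: each ID must become a separate argument.
    docker rm $ids
}
```

    As with the plain command, run it only after serviced has been shut down.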

    If some Docker metadata wasn't fully written to disk, problems can manifest in various ways, for example:

    The Docker daemon could refuse to start. This is often caused by the presence of one or more zero-length files in /var/lib/docker (specifically, /var/lib/docker/trust/official.json and /var/lib/docker/repositories-btrfs are known offenders). You can safely delete these files and restart Docker to recover from this.

    The serviced daemon could fail to start one or more internal services, logging API Error (500). The Docker logs will show a more specific error, for example Cannot find child for /serviced-isvcs_logstash. This occurs when Docker's own internal graph database is corrupted and has, so far, only been seen on Docker versions earlier than 1.6.0. To correct this issue, perform the following:

    Shut down serviced (it should shut down on its own anyway)

    Run the command docker ps -a to display all (stopped) containers remaining on the system. Several of them might have no status or name. This is the problem.

    Remove all containers:

    docker rm $(docker ps -aq)

    Start serviced
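
    The zero-length files described above can be located before deciding what to delete. This is a minimal sketch: list_zero_length_files is a hypothetical name, and it assumes a find that supports -maxdepth, such as GNU find. A depth of two covers both of the paths named above.

```shell
# Hypothetical helper: list zero-length regular files at most two levels
# below the given directory (defaults to /var/lib/docker).
list_zero_length_files() {
    find "${1:-/var/lib/docker}" -maxdepth 2 -type f -size 0
}
```

    Run it as root with Docker stopped, and inspect the list before removing anything.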

    Corruption Of Control Center Zookeeper (/opt/serviced/var/isvcs/zookeeper) Files

    The most likely effect of Zookeeper becoming corrupted is that services will not start, or will not start correctly. They might also report bad networking imports. Virtual hosts might not work properly. This data can be rebuilt. Perform the following:

    Shut down serviced

    Remove the directory:

    sudo rm -rf /opt/serviced/var/isvcs/zookeeper

    Delete all existing containers:

    docker rm $(docker ps -qa)

    Start serviced

    Restart serviced on all remote hosts

    Corruption of Control Center HBase/OpenTSDB (/opt/serviced/var/isvcs/opentsdb) Files

    If the internal HBase becomes corrupted, it may be able to recover by itself. It attempts this on startup and logs accordingly. On very large systems, the default heap settings may be inadequate for this recovery process. You'll know because HBase will keep shutting itself down with an error indicating it is out of heap. You can increase the heap temporarily:

    Attach to the running container:

    docker exec -it serviced-isvcs_opentsdb bash

    Modify the max heap size:

    echo "export HBASE_HEAPSIZE=2048" >> /opt/hbase*/conf/hbase-env.sh

    Restart HBase:

    supervisorctl -c /opt/zenoss/etc/supervisor.conf restart hbase

    These settings will be reverted when serviced is restarted. If HBase repairs its corruption successfully, it will start normally. If the repair fails, the HMaster logs will indicate this and you might need to proceed with additional HBase recovery, for example:

    Attach to the running container:

    docker exec -it serviced-isvcs_opentsdb bash

    Run the HBase repair tool:

    JAVA_HOME="/usr/lib/jvm/java-7-openjdk-amd64" HBASE_HOME=/opt/hbase-0.94.16 /opt/hbase-0.94.16/bin/hbase hbck -fix

    If HBase is corrupted beyond repair, you might need to remove the existing data to allow it to start. Perform the following:

    On the master host, stop all processes in the internal metrics container:

    docker exec -it serviced-isvcs_opentsdb supervisorctl -c /opt/zenoss/etc/supervisor.conf stop all

    Remove the HBase data:

    rm -rf /opt/serviced/var/isvcs/opentsdb/hbase/.*

    Start the metrics processes:

    docker exec -it serviced-isvcs_opentsdb supervisorctl -c /opt/zenoss/etc/supervisor.conf start all

    Corruption of Zenoss RabbitMQ (/opt/serviced/var/volumes/*/rabbitmq) Files

    If RabbitMQ data has become corrupted, RabbitMQ will be unable to start, or it will start but some processes will be unable to connect. If this happens, you should remove the existing data. Any messages that were in the queues when the hardware failure occurred will be lost.

    Stop the RabbitMQ service:

    serviced service stop rabbitmq

    Delete the RabbitMQ data:

    export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})

    export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID

    rm -rf $SVCROOT/rabbitmq

    rm -f $SVCROOT/.rabbitmq.serviced.initialized

    Restart RabbitMQ:

    serviced service start rabbitmq
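
    The rm -rf commands in this and the following sections depend on SERVICE_ID being set correctly. The sketch below performs the same extraction (second field of the second line of the status output) and adds a guard against an empty ID. The function names service_id_from_status and svcroot_for are hypothetical.

```shell
# Same extraction as above: second field of the second line of the status output.
service_id_from_status() {
    awk 'NR == 2 { print $2 }'
}

# Refuse to build a volume path from an empty service ID.
svcroot_for() {
    if [ -z "$1" ]; then
        echo "empty service ID; not building SVCROOT" >&2
        return 1
    fi
    echo "/opt/serviced/var/volumes/$1"
}

# On a real master host:
#   SERVICE_ID=$(serviced service status Zenoss.resmgr | service_id_from_status)
#   SVCROOT=$(svcroot_for "$SERVICE_ID") || exit 1
```

    Stopping when the ID is empty keeps a failed status lookup from turning into a delete against the wrong directory.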

    Corruption of Zenoss HBase (/opt/serviced/var/volumes/*/hbase-master) Files

    The Zenoss HBase can be recovered in a similar way to the internal HBase. Perform the following:

    Attach to the running container:

    serviced service attach hmaster

    Run the HBase repair tool:

    su - hbase -c "hbase hbck -fix"

    If HBase is corrupted beyond repair, you may need to remove the existing data to allow it to start.

    Warning: Performing the following steps removes all performance data.

    To remove existing data to enable HBase to start, perform the following:

    On the master host, stop all HBase and OpenTSDB processes:

    serviced service stop hbase

    serviced service stop opentsdb

    Remove the HBase data:

    export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})

    export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID

    sudo rm -rf $SVCROOT/hbase-*

    sudo rm -rf $SVCROOT/.hbase-*.serviced.initialized

    Start HBase and OpenTSDB:

    serviced service start hbase

    serviced service start opentsdb

    Corruption of Zenoss Zookeeper (/opt/serviced/var/volumes/*/hbase-zookeeper-*) Files

    It is possible, though not particularly likely, for the Zookeeper instance(s) used for HBase to become corrupted on hardware failure. Recovery consists of removing the corrupted data:

    Stop HBase:

    serviced service stop hbase

    Delete the Zookeeper data:

    export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})

    export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID

    sudo rm -rf $SVCROOT/hbase-zookeeper-*

    sudo rm -rf $SVCROOT/.hbase-zookeeper-*.serviced.initialized

    Restart HBase:

    serviced service start hbase

    Corruption of Event Indexes (/opt/serviced/var/volumes/*/zeneventserver/index)

    If the Lucene-based event index becomes corrupted, zeneventserver will automatically rebuild it once the corrupted index data is removed. Perform the following:

    Stop zeneventserver:

    serviced service stop zeneventserver

    Delete the index data:

    export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})

    export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID

    rm -rf $SVCROOT/zeneventserver/index

    Start zeneventserver:

    serviced service start zeneventserver

    The zeneventserver log will indicate the indexes are being rebuilt.

    Corruption of Catalog Service Indexes (/opt/serviced/var/volumes/*/zencatalogservice)

    If the Lucene-based model index becomes corrupted, you must remove the data and rebuild the catalog. Perform the following:

    Stop zencatalogservice:

    serviced service stop zencatalogservice

    Remove the catalog data:

    export SERVICE_ID=$(serviced service status Zenoss.resmgr | sed -n '2p' | awk {'print $2'})

    export SVCROOT=/opt/serviced/var/volumes/$SERVICE_ID

    rm -rf $SVCROOT/zencatalogservice

    rm -rf $SVCROOT/.zencatalogservice.serviced.initialized

    Start zencatalogservice:

    serviced service start zencatalogservice

    With zencatalogservice started, rebuild the catalog. Because this can take some time to complete, perform it in a screen session:

    serviced service attach zope/0 su - zenoss -c "zencatalog run --createcatalog --forceindex"

    Corruption of Redis (/opt/serviced/var/volumes/*/redis) Files

    If hardware failure occurs in the middle of a snapshot, redis may write an incomplete or zero-length database dump to disk. The redis server will be unable to start and logs will show that it is unable to load the database. Additionally, serviced will continually attempt to restart it. Recovery requires deleting the bad snapshot file. (NOTE: any in-flight data in Redis at the time of hardware failure will be lost).

    Attach to the redis container:

    serviced service attach redis

    Check the logs to verify the issue:

    tail -f /var/log/redis/redis.log

    If this is the problem, delete the bad snapshot:

    rm -f /var/lib/redis/dump.rdb

    Redis should recover when it is restarted by serviced (usually within 10 seconds).

  • Version 2.1.0 of the Ceph ZenPack (ZenPacks.zenoss.Ceph) has been released.

    This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements:

    ZenPacks.zenoss.PythonCollector >= 1.6.1

    The following changes have been made since the previous release: 2.0.1

    Add backend_addr to host components (ZPS-281)

    Add sudo support for Docker containers (ZPS-5736)

    Use sudo fallback for .asok sockets in modeler (ZPS-5246)

    Change Pool IO to rate instead of gauge (ZPS-3039)

    Clarify and enhance docs for sudoers (ZPS-270)

    Clarify docs on installing SSH device (ZPS-5742)

    Fix pre-Luminous health-check errors (ZPS-5738)

    Guard against missing d.cephProxyComponentUUID in DeviceProxy (ZPS-4945)

    This version has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.0.0 of the DMTF Redfish ZenPack (ZenPacks.zenoss.Redfish) has been released.

    This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements:

    ZenPacks.zenoss.PythonCollector

    ZenPacks.zenoss.ZenPackLib 2.1.0 or higher

    This is the first release of the Redfish ZenPack.

    Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Monitoring a Microsoft Cluster

    The first step is to add the virtual hostname of the cluster to Resource Manager in the /Server/Microsoft/Cluster device class. Once the cluster is modeled, any nodes associated with it are automatically added to the /Server/Microsoft/Windows device class. The only three valid modeler plugins for device class /Server/Microsoft/Cluster are WinCluster, OperatingSystem, and WinMSSQL.

    Do not add the WinCluster plugin to the /Server/Microsoft/Windows device class. The /Server/Microsoft/Windows device class uses a python class that is different from the python class used in the /Server/Microsoft/Cluster device class, and they have different relationships.

    Monitoring MSSQL

    When monitoring MSSQL Enterprise edition in Fail Over mode, the WinMSSQL modeler plugin should be enabled only on the cluster hostname. For Single Server mode, the WinMSSQL modeler plugin is enabled on the physical node running MSSQL. It is not possible to monitor Fail Over mode from a single host in /Server/Microsoft/Windows.

  • Version 3.0.0 of the Cisco UCS ZenPack (ZenPacks.zenoss.CiscoUCS) has been released. This is a major release that brings capacity related features and other enhancements into the ZenPack.

    This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    CiscoMonitor ZenPack >= 4.0

    Calculated Performance ZenPack >= 2.5.1

    Dashboard ZenPack

    DynamicView ZenPack

    Predictive Threshold ZenPack

    PythonCollector ZenPack

    ZenPackLib ZenPack >= 2.1

    Note: This version of CiscoUCS ZenPack fully replaces UCSCapacity - If you are using UCSCapacity ZenPack in your Zenoss installation, it should be upgraded to 2.0.0 before upgrading CiscoUCS 2.x to CiscoUCS 3.0.0. UCSCapacity can be removed after CiscoUCS has been upgraded.

    The following changes have been made since the previous release: 2.8.0.

    Add all UCSCapacity features

    Upgrade to ZenPackLib 2.x

    Fix "Edit with Domain Designer" issue on Cisco UCS Manager Domain. (ZPS-1671)

    Fix massively growing counts for Cisco UCS faults. (ZPS-3202)

    Do not autogenerate CiscoUCS reports while opening. (ZPS-4175)

    Show a spinner wheel when generating Storage reports. (ZPS-4305)

    Ensure CIMC session closing after collection when zCiscoUCSCIMCReuseSessions is false. (ZPS-4555)

    Guard against empty object maps in modeler. (ZPS-5617)

    Allow user specified severity fields for UCS Manager events. (ZPS-5199)

    Fix slider in Dependency View to update resources by utilization. (ZPS-5643)

    Update Topology View for S3260 storage chassis. (ZPS-3576)

    This version has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 4.0.0 of the NetApp Monitor ZenPack (ZenPacks.zenoss.NetAppMonitor) has been released.

    This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector >= 1.8.0

    ZenPacks.zenoss.StorageBase >=1.4.3

    ZenPacks.zenoss.CalculatedPerformance (any version)

    ZenPacks.zenoss.ZenPackLib >= 2.1

    The following changes have been made since the previous release: 3.7.0.

    Add SNMP datapoints and new graphs on device level and for LUN and HardDisk components (ZPS-2451)

    Add support for monitoring events of OnCommand Unified Manager (ZPS-2961)

    Convert monitoring templates to ZPL format (ZPS-4326)

    Convert NetApp components to ZPL format (ZPS-4326)

    Extend support of SNMP monitoring with additional datapoints and events (ZPS-2451)

    Fix thresholds for inode usage for offline volumes (ZPS-5394)

    Improve Impact and Dependency View relations (ZPS-4326)

    Move frequently changing metrics from modeling to monitoring (ZPS-5114)

    This version has been tested with Zenoss Resource Manager 6.3.2, Zenoss Cloud and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 1.0.2 of the Nutanix ZenPack (ZenPacks.zenoss.Nutanix) has been released.

    This release is compatible with Zenoss 6.2 - 6.3 and Zenoss Cloud, and has the following requirements.

    ZenPacks.zenoss.ZenPackLib 2.0.8 or higher

    ZenPacks.zenoss.PythonCollector

    The following changes have been made since the previous release: 1.0.1.

    Fix minor log warnings (ZPS-5620)

    Fix incorrect datapoint for vm/cvm memory_usage_pct (ZPS-2904)

    Fix unhandled plugin errors while monitoring (ZPS-3428)

    Fix modeling when cluster uuid missing from API (ZPS-4189)

    This release has been tested with Zenoss Cloud, Zenoss 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 2.9.3 of the Microsoft Windows ZenPack (ZenPacks.zenoss.Microsoft.Windows) has been released.

    This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector 1.4 or higher

    ZenPacks.zenoss.ZenPackLib 2.0.5 or higher

    The following changes and fixes have been made since the previous release: 2.9.2.

    Fix deprecated Get-WmiObject cmdlet for PowerShell Core (ZPS-4927)

    Fix Windows Perfmon data collection stops for a long time after device reboot (ZPS-4473)

    Fix Windows - No freespace on cluster shared volumes (ZPS-4612)

    Improve handling in the Perfmon datasource of "is not recognized as the name of a cmdlet" errors (ZPS-3517)

    Fix 500 Operation Timeout Errors when modeling and/or monitoring SQL Server (ZPS-4638)

    Fix Windows Cluster - wrong IP address can be returned for SQL Server node (ZPS-4703)

    Fix Windows Cluster - SQL Server failover doesn't trigger a remodel if cluster group stays the same (ZPS-4707)

    Fix Windows Perfmon data collection stops for a long time after collection interruption (ZPS-4473)

    Fix Windows may connect to device with wrong zWinRMUser (ZPS-3564)

    Fix Microsoft.Windows - Detect .NET version better for EventLogDataSource (ZPS-5399)

    Fix Windows Disconnected Network Drives that Cause PowerShell Error (ZPS-4866)

    Fix After upgrade to 2.9.x, some/most modeler plugins fail with "'NoneType' object has no attribute 'getConnection'" (ZPS-5087)

    Fix Windows Cluster - clusterOwnerChange may not be in zWindowsRemodelEventClassKeys after install/upgrade (ZPS-4887)

    Fix Windows Cluster - SQL Server instance metrics may not be found (ZPS-4888)

    Fix WindowsServiceLog "The referenced context has expired" error (ZPS-3216)

    Add ERROR handling for empty Win32_SystemEnclosure data (ZPS-5253)

    Fix Windows devices monitored over https regularly fail collection (ZPS-5323)

    Increase flexibility in the Microsoft Windows ZenPack for data sources that use Microsoft's Event Log (ZPS-5585)

    Fix Windows Cluster - Missing or no data returned when querying job events do not clear after failover remodel (ZPS-4874)

    This release has been tested with Zenoss Cloud, Zenoss Resource Manager 6.3.2 and Service Impact 5.3.4. Detailed information on this ZenPack is available from the ZenPack Catalog.

  • Version 5.10.0 of the Cisco Devices ZenPack (ZenPacks.zenoss.CiscoMonitor) has been released.

    This release is compatible with Zenoss versions 6.2 - 6.3 and Zenoss Cloud, and has the following additional requirements.

    ZenPacks.zenoss.PythonCollector >= 1.4

    ZenPacks.zenoss.ZenPackLib >= 2.0.5

    The following changes have been made since the previous release: 5.9.0.

    Add support for ASA VPN tunnels. (SVC-2005)

    Add missing hardware models. (ZPS-3515)

    Add Impact policies to CiscoDevice for Fans, PSU, and Temperature Sensor. (ZPS-3972)

    Handle unknown StandbyState in cHsrpStateChange. (ZPS-3644)

    Improve performance of SNMP trap transform.

    Make network interface descriptions searchable. (ZPS-3949)

    Fix for "invalid literal for int()" errors in ISDN monitoring. (ZPS-3698)

    Fix for "AttributeError: vnis" in zenmapper.log. (ZPS-3965)

    Do not autogenerate Cisco Inventory report. (ZPS-4204)

    Show a spinner wheel when generating Cisco Inventory report. (ZPS-4253)

    Add Device-Modern template and change current templates for ASA Devices. (ZPS-3598)

    This version has been tested with Zenoss 6.3.2 and Zenoss Cloud. Detailed information on this ZenPack is available from the ZenPack Catalog.
