Saturday, August 4, 2012

EM 10.2 agent with intermittent errors

A few days ago, an Oracle 10g installation with Grid Control started sending messages with the subject: "EM Alert: Unreachable Start: test.myhost.com - Agent has stopped monitoring. The follo..."

The message body included this text:

    Target Name=test.myhost.com
    Target Type=Host
    Host=test.myhost.com
    Metric=Status
    Severity=Unreachable Start
    Message=Agent has stopped monitoring. The following errors are reported : COLL_DISABLED|DISK_FULL.
    Notification Rule Name=Host Availability and Critical States
    Notification Rule Owner=SYSMAN
    Notification Count=1


It was followed by another message, "EM Alert: Unreachable Clear: test.myhost.com:1831 - Agent Unreachability is cleared. The cu..."

That second message contained the text:
    "Message=Agent Unreachability is cleared. The current status of the target is UP."

Searching the support site for this message turns up note ID 396238.1 - 'emctl status agent' Command Shows "Collection Status Disabled By Upload Manager", which describes this case and says to check the agent log to find the cause of the error.

So, checking the file sysman/log/emagent.trc under my $GRID_HOME (/u01/app/oracle/product/10.2.0/GridControl10g/agent10g/test.myhost.com), the cause turned out to be that the amount of upload data waiting to be sent to the repository exceeded the maximum allowed, which is 50 MB by default:

    [oracle@test]$ tail sysman/log/emagent.trc

    2012-07-09 10:30:14,707 Thread-582644352 WARN  collector: enable collector
    2012-07-09 10:30:14,783 Thread-582644352 WARN  collector: Regenerating all DefaultColls
    2012-07-09 10:43:52,823 Thread-137835136 ERROR upload: Exceeded max. amount of upload data: 26 files, 51.584184 MB Data. 88.19% of disk used. Disabling collections.
    2012-07-09 10:43:52,823 Thread-137835136 WARN  collector: Disable collector
    2012-07-09 10:44:52,343 Thread-128397952 WARN  upload: Amount of upload data lowered sufficiently. enabling collections and regenerating metadata
    2012-07-09 10:44:52,343 Thread-128397952 WARN  TargetManager: Regenerating all Metadata
    2012-07-09 10:44:52,599 Thread-128397952 WARN  upload: Truncating value of "SHORT_NAME" from "Average Synchronous Single-Block Read Latency (ms)" to "Average Synchronous Single-Block Read La"
    2012-07-09 10:44:53,078 Thread-128397952 WARN  collector: enable collector
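
To pull just these upload events out of a long trace file, something like the following could help. It is only a sketch: the function name `upload_events` is made up for this post, and the two patterns are copied from the log lines shown above, so they may need adjusting for other agent versions.

```shell
# Hypothetical helper: print only the upload lines that record
# collections being disabled or re-enabled in an agent trace file.
upload_events() {
    grep -E 'Exceeded max\. amount of upload data|Amount of upload data lowered' "$1"
}

# Example (run from the agent home):
#   upload_events sysman/log/emagent.trc
```

Each "Exceeded max." line includes the number of files and the size in MB at the moment collections were disabled, which gives an idea of how far above the limit the agent went.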


Following the note's recommendations, we stop the agent, raise this value to 80 MB, and start the agent again:

    [oracle@test]$ bin/emctl stop agent
    Oracle Enterprise Manager 10g Release 4 Grid Control 10.2.0.4.0.
    Copyright (c) 1996, 2007 Oracle Corporation.  All rights reserved.
    Stopping agent ... stopped.

    [oracle@test]$ vi sysman/config/emd.properties
    # UploadMaxBytesXML=50
    UploadMaxBytesXML=80

    [oracle@test]$ bin/emctl start agent
    Oracle Enterprise Manager 10g Release 4 Grid Control 10.2.0.4.0.
    Copyright (c) 1996, 2007 Oracle Corporation.  All rights reserved.
    Starting agent ........ started.
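
If you would rather not edit the file by hand in vi, the same change could be scripted roughly like this. This is a sketch under assumptions, not something taken from the note: `set_upload_max` is a name invented here, it keeps a `.bak` copy of the file, and it expects the property to appear uncommented at the start of a line (otherwise it appends it).

```shell
# Hypothetical helper: set UploadMaxBytesXML (in MB) in a properties file,
# keeping a backup copy next to it as <file>.bak.
set_upload_max() {
    file=$1
    mb=$2
    cp -p "$file" "$file.bak" || return 1
    if grep -q '^UploadMaxBytesXML=' "$file.bak"; then
        # Rewrite the existing line; all other lines are copied unchanged.
        sed "s/^UploadMaxBytesXML=.*/UploadMaxBytesXML=$mb/" "$file.bak" > "$file"
    else
        # Property not set explicitly: append it at the end.
        echo "UploadMaxBytesXML=$mb" >> "$file"
    fi
}

# Example (with the agent stopped, from the agent home):
#   set_upload_max sysman/config/emd.properties 80
```

As with the manual edit, the agent should be stopped before the change and started again afterwards.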


Then we verify that the agent is working normally:

    [oracle@test]$ bin/emctl status agent
    Oracle Enterprise Manager 10g Release 4 Grid Control 10.2.0.4.0.
    Copyright (c) 1996, 2007 Oracle Corporation.  All rights reserved.
    ---------------------------------------------------------------
    Agent Version     : 10.2.0.4.0
    OMS Version       : 10.2.0.4.0
    Protocol Version  : 10.2.0.4.0
    Agent Home        : /u01/app/oracle/product/10.2.0/GridControl10g/agent10g/test.myhost.com
    Agent binaries    : /u01/app/oracle/product/10.2.0/GridControl10g/agent10g
    Agent Process ID  : 7537
    Parent Process ID : 9232
    Agent URL         : https://test.myhost.com:1831/emd/main
    Repository URL    : https://oragrdp1.myhost.com:1159/em/upload
    Started at        : 2012-07-04 07:22:37
    Started by user   : oracle
    Last Reload       : 2012-07-05 14:09:50
    Last successful upload                       : 2012-07-09 10:47:33
    Total Megabytes of XML files uploaded so far : 25865.11
    Number of XML files pending upload           :        0
    Size of XML files pending upload(MB)         :     0.00
    Available disk space on upload filesystem    :    12.01%
    Last successful heartbeat to OMS             : 2012-07-09 10:46:57
    ---------------------------------------------------------------

    Agent is Running and Ready
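
To keep an eye on the upload backlog without reading the whole status report, the pending size can be extracted from this output. Again just a sketch: `pending_upload_mb` is a hypothetical name, and it assumes the label "Size of XML files pending upload" appears exactly as in the output above.

```shell
# Hypothetical helper: read "emctl status agent" output on stdin and
# print the size (in MB) of the XML files still pending upload.
pending_upload_mb() {
    awk -F: '/Size of XML files pending upload/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the number
        print $2
    }'
}

# Example (from the agent home):
#   bin/emctl status agent | pending_upload_mb
```

A value that keeps growing toward the UploadMaxBytesXML limit would suggest the agent is heading for the same COLL_DISABLED situation again.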

After this change, the error messages did not come back. Regards.