
# Monitoring and Observability

A production Fineract deployment has several subsystems that need ongoing monitoring: the application itself, the database connection pools, the nightly COB batch run, and the business events pipeline. This guide covers the endpoints, queries, and log patterns to use.

## Health Check

Spring Boot Actuator exposes a health endpoint that reports the overall status of the application and its dependencies:

```bash
curl https://your-instance.finecko.com/fineract-provider/actuator/health \
  -H "Fineract-Platform-TenantId: default"
```

Response when healthy:

```json
{
  "status": "UP",
  "components": {
    "db": { "status": "UP" },
    "diskSpace": { "status": "UP", "details": { "free": 42949672960, ... } },
    "ping": { "status": "UP" }
  }
}
```

A `DOWN` status on the `db` component means Fineract cannot reach its tenant store database, and API requests will start failing. Monitor this endpoint from your infrastructure health check system and alert immediately on any status other than `UP`.

For Finecko managed instances, the platform continuously monitors this endpoint. Infrastructure alerts are handled by Finecko operations. You can query it yourself for your own dashboards or integration health checks.
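For your own dashboards, the aggregate status can be pulled out of the health JSON with no extra tooling. A minimal sketch: the `classify_health` helper below is hypothetical (not a Fineract or Finecko API), and a production monitor should parse the JSON properly with `jq`.

```shell
# classify_health: read an actuator /health JSON document on stdin and print
# the aggregate status. The top-level "status" field appears first in the
# response, so the first match is the overall status. Illustrative only;
# prefer jq for real tooling.
classify_health() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | cut -d'"' -f4
}

# Wiring it to the live endpoint would look like (not executed here):
#   curl -s https://your-instance.finecko.com/fineract-provider/actuator/health \
#     -H "Fineract-Platform-TenantId: default" | classify_health

# Offline demonstration with a canned unhealthy response:
printf '%s' '{"status":"DOWN","components":{"db":{"status":"DOWN"},"ping":{"status":"UP"}}}' \
  | classify_health
# prints: DOWN
```

Note the component-level `"UP"` on `ping` does not fool the helper, because only the first (aggregate) `status` field is read.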

## Metrics Endpoint

Spring Boot Actuator exposes application metrics in a format compatible with Prometheus and other monitoring systems:

```bash
# List all available metric names
curl https://your-instance.finecko.com/fineract-provider/actuator/metrics \
  -H "Fineract-Platform-TenantId: default"

# Get a specific metric
curl "https://your-instance.finecko.com/fineract-provider/actuator/metrics/hikaricp.connections.active" \
  -H "Fineract-Platform-TenantId: default"
```

For Prometheus scraping, the metrics are exposed in Prometheus text format at:

```
/fineract-provider/actuator/prometheus
```

### Key Metrics to Monitor

**Connection pool (HikariCP):**

| Metric | Alert threshold | Meaning |
|---|---|---|
| `hikaricp.connections.active` | > 80% of max pool | High connection utilisation |
| `hikaricp.connections.pending` | > 0 sustained | Requests waiting for a connection |
| `hikaricp.connections.timeout` | Any | Connection pool exhaustion events |
| `hikaricp.connections.acquire` | Rising p99 | Slow connection acquisition |

Connection pool exhaustion is one of the most common causes of API timeouts under load. Alert early (at 70-80% utilisation) rather than waiting for errors.
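The 80% rule is simple arithmetic on two values you have already scraped (`hikaricp.connections.active` and the configured pool maximum). A minimal sketch, with the sample numbers chosen here for illustration:

```shell
# pool_utilisation: active connections as an integer percentage of pool max
pool_utilisation() {
  echo $(( $1 * 100 / $2 ))
}

# Alert at 80% utilisation, per the threshold in the table above
util=$(pool_utilisation 26 30)   # e.g. 26 active connections, pool max of 30
if [ "$util" -ge 80 ]; then
  echo "ALERT: connection pool at ${util}% utilisation"
fi
# prints: ALERT: connection pool at 86% utilisation
```

In practice this comparison belongs in a Prometheus alert rule rather than a script, but the threshold math is the same.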

**HTTP threads (Tomcat):**

| Metric | Alert threshold | Meaning |
|---|---|---|
| `tomcat.threads.busy` | > 60% of max | Warning level: rising request queue |
| `tomcat.threads.busy` | > 75% of max | Action required: requests will queue |

**JVM:**

| Metric | Alert threshold | Meaning |
|---|---|---|
| `jvm.memory.used` (heap) | > 85% of max | Risk of GC pressure or OOM |
| `jvm.gc.pause` | p99 > 500ms | Long GC pauses affecting latency |

**HTTP request performance:**

| Metric | What to watch |
|---|---|
| `http.server.requests` (count) | Request volume per endpoint |
| `http.server.requests` (max/sum) | Response time trends |
| `http.server.requests{status="5xx"}` | Error rate; alert on any sustained 5xx |

## COB Monitoring

The most operationally critical check is whether the nightly Close of Business (COB) run completed successfully.

### Check via API

```bash
curl https://your-instance.finecko.com/fineract-provider/api/v1/jobs \
  -H "Fineract-Platform-TenantId: default" \
  -H "Authorization: Basic $(echo -n 'mifos:password' | base64)"
```

Filter the response for the Loan COB job and check `lastRunHistory.status`. Build a nightly alert that checks this after the scheduled COB window and notifies on any status other than `SUCCESS`.
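One way to sketch that nightly check, assuming the `/jobs` response carries `displayName` and `lastRunHistory.status` fields as described above. The crude `grep`-based extraction below relies on field order in the serialised JSON and is illustrative only; real tooling should use `jq`.

```shell
# loan_cob_status: pull the last-run status for the "Loan COB" job out of a
# /jobs JSON response on stdin. Assumes the status appears after displayName
# with no intervening closing brace, which holds for the compact shape used
# below; a production check should parse the JSON properly.
loan_cob_status() {
  grep -o '"displayName":"Loan COB"[^}]*"status":"[A-Z]*"' \
    | grep -o '"status":"[A-Z]*"' \
    | cut -d'"' -f4
}

# Demonstration with a canned failing response:
sample='{"jobs":[{"displayName":"Loan COB","lastRunHistory":{"status":"FAILED"}}]}'
status=$(printf '%s' "$sample" | loan_cob_status)
[ "$status" = "SUCCESS" ] || echo "ALERT: Loan COB last run status: ${status:-UNKNOWN}"
# prints: ALERT: Loan COB last run status: FAILED
```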

### Check via Database

For deeper inspection, query the tenant database directly:

```sql
-- Last 7 COB run statuses
SELECT
  job_name,
  trigger_type,
  run_start_time,
  run_end_time,
  status,
  error_log
FROM job_run_history
WHERE job_name = 'Loan COB'
ORDER BY run_start_time DESC
LIMIT 7;
```

```sql
-- Loans with processing errors from the last COB run
SELECT
  loan_id,
  error_message,
  created_date
FROM m_loan_cob_error
ORDER BY created_date DESC
LIMIT 50;
```

```sql
-- Loans currently locked by COB (should be zero outside the COB window)
SELECT COUNT(*) AS locked_loan_count
FROM m_loan
WHERE locked_by_cob_job_id IS NOT NULL;
```

A non-zero `locked_loan_count` outside the COB processing window indicates a failed or hung run. Locked loans cannot be transacted against; contact Finecko support to unlock them.

### Business Date

COB advances the business date by one day on successful completion. If the business date is lagging behind the calendar date, COB has not run successfully:

```sql
SELECT config_value AS business_date
FROM c_configuration
WHERE config_key = 'BUSINESS_DATE';
```

An alert on `business_date < CURDATE() - INTERVAL 1 DAY` (business date more than one day behind) is a reliable indicator of a missed COB run.
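The same lag check can live outside SQL, for example in the script that drives your alerting. A sketch of the date arithmetic, assuming GNU `date`:

```shell
# days_behind: whole days between the stored business date and a reference
# date, both YYYY-MM-DD interpreted as UTC midnights (GNU date assumed)
days_behind() {
  echo $(( ( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ) / 86400 ))
}

# Example: business date from the query above vs. today's calendar date
lag=$(days_behind "2024-03-10" "2024-03-13")
if [ "$lag" -gt 1 ]; then
  echo "ALERT: business date is ${lag} days behind; COB has likely not run"
fi
# prints: ALERT: business date is 3 days behind; COB has likely not run
```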

## External Events Backlog

If you have business events enabled, monitor the event backlog in `m_external_event`. A large or growing `TO_BE_SENT` count indicates that the Send Async Events job is not running, or that the message broker is unreachable:

```sql
-- Summary of event pipeline status
SELECT status, COUNT(*) AS event_count
FROM m_external_event
GROUP BY status;

-- Oldest undelivered events (should be recent; old events indicate a stalled pipeline)
SELECT MIN(created_at) AS oldest_pending_event
FROM m_external_event
WHERE status = 'TO_BE_SENT';
```

Alert if `oldest_pending_event` is more than 15-30 minutes old during business hours. A completely stalled pipeline will have a `TO_BE_SENT` count that grows continuously.
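The 15-30 minute rule translates directly into a staleness computation on that timestamp. A sketch, assuming ISO-8601 timestamps and GNU `date`:

```shell
# pending_age_minutes: minutes between the oldest TO_BE_SENT created_at and
# the current time, both passed as ISO-8601 timestamps (GNU date assumed)
pending_age_minutes() {
  echo $(( ( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ) / 60 ))
}

# Example: oldest pending event from the query above vs. "now"
age=$(pending_age_minutes "2024-05-01T09:00:00" "2024-05-01T09:45:00")
if [ "$age" -gt 30 ]; then
  echo "ALERT: oldest pending event is ${age} minutes old"
fi
# prints: ALERT: oldest pending event is 45 minutes old
```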

## Log Monitoring

Fineract logs to stdout/stderr in JSON or plain text format, depending on configuration. Key patterns to watch for in log aggregation systems (ELK, CloudWatch Logs, Datadog, etc.):

**`ERROR` level**: any `ERROR` log should be reviewed. High-frequency `ERROR` logs often indicate a configuration problem or connectivity issue.

**Connection pool warnings:**

```
HikariPool ... - Connection is not available, request timed out after ...ms
```

This indicates pool exhaustion. Increase `FINERACT_CONFIG_MAX_POOL_SIZE` or reduce concurrent load.
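In most deployments this becomes an alert rule in the log aggregator, but a minimal counter over a raw log stream looks like the sketch below (the `count_pool_timeouts` helper is hypothetical):

```shell
# count_pool_timeouts: count pool-exhaustion warnings in a log stream,
# matching the HikariPool message pattern shown above
count_pool_timeouts() {
  grep -c 'Connection is not available, request timed out' || true
}

n=$(printf '%s\n' \
  "WARN  HikariPool-1 - Connection is not available, request timed out after 30000ms." \
  "INFO  normal request" | count_pool_timeouts)
echo "pool timeout warnings: ${n}"
# prints: pool timeout warnings: 1
```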

**COB failure indicators:**

```
Loan COB job failed
StepExecutionListener ... FAILED
```

Trigger a COB failure alert when these appear during the COB processing window.

**Liquibase migration messages** appear at startup; a failed migration blocks startup entirely:

```
Liquibase 'update' Skipped
Migration of tenant ... failed
```

**Batch partition errors:**

```
Error in step execution for partition
Remote job message handler error
```

These indicate COB worker communication failures.

## Checking the Scheduler

Verify that scheduled jobs are running as expected:

```bash
# List all jobs with their schedules and last run status
curl https://your-instance.finecko.com/fineract-provider/api/v1/jobs \
  -H "Fineract-Platform-TenantId: default" \
  -H "Authorization: Basic $(echo -n 'mifos:password' | base64)"
```

Verify that these key jobs show `active: true` with a recent `lastRunHistory.status` of `SUCCESS`:

| Job name | Purpose |
|---|---|
| Loan COB | Nightly Close of Business processing |
| Send Async Events | Dispatches business events to the message broker |
| Purge External Events | Removes old SENT events from `m_external_event` |
| Apply Annual Fee | Posts annual fee charges to savings accounts |
| Update NPA | Updates Non-Performing Asset classification |

If any of these jobs show `active: false`, they have been accidentally disabled and need to be re-enabled via the API or the Mifos Web UI admin interface.
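As with the COB check, the `/jobs` response can be scanned for disabled jobs. The crude sketch below assumes `displayName` directly precedes `active` in the serialised JSON, which is an assumption about field order rather than a guarantee; use `jq` for anything real.

```shell
# inactive_jobs: print the displayName of every job with active=false from a
# /jobs response on stdin. Relies on field order; treat as illustrative only.
inactive_jobs() {
  grep -o '"displayName":"[^"]*","active":false' | cut -d'"' -f4
}

# Demonstration with a canned response:
sample='[{"displayName":"Loan COB","active":true},{"displayName":"Update NPA","active":false}]'
printf '%s' "$sample" | inactive_jobs
# prints: Update NPA
```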

For Finecko managed customers who want their own observability layer:

- **Metrics**: scrape `/actuator/prometheus` with Prometheus, visualise in Grafana
- **Logs**: forward stdout to your log aggregation platform (ELK, Datadog, CloudWatch)
- **Alerts**: set up PagerDuty or OpsGenie rules on COB status, connection pool utilisation, and 5xx error rate
- **Uptime**: monitor `/actuator/health` from an external synthetic monitoring service

Contact Finecko support for details on exporting metrics and logs from your managed instance.