Scaling and Availability
The Scaling Model
PCF enables both vertical and horizontal scaling, with horizontal being the recommended approach.
Horizontal Scaling: Instances
Run multiple copies of your application:
```bash
# Scale to 5 instances
$ cf scale my-app -i 5

# Gorouter load balances across 5 running instances
# Each handles ~1/5th of traffic
# If one fails, 4 are still serving users
```
Vertical Scaling: Memory
Allocate more resources per instance:
```bash
# Give each instance 2GB of memory
$ cf scale my-app -m 2G

# Each instance has more memory
# Better for memory-intensive apps
# But can't scale beyond the capacity of a single Diego cell
```
Best practice: Horizontal over vertical. Scale instances, not memory.
Auto-Scaling with Policies
PCF can automatically scale applications based on metrics:
```bash
# Enable autoscaling: save the policy as policy.json
$ cat policy.json
{
  "instance_min_count": 2,
  "instance_max_count": 10,
  "scaling_rules": [
    {
      "metric_type": "cpu",
      "threshold": 80,
      "operator": ">",
      "cool_down_seconds": 60,
      "adjustment": "+1"
    },
    {
      "metric_type": "memory",
      "threshold": 90,
      "operator": ">",
      "cool_down_seconds": 60,
      "adjustment": "+2"
    }
  ]
}

# Attach the policy to the app (command name depends on the
# installed App Autoscaler CLI plugin)
$ cf attach-autoscaling-policy my-app policy.json
```
Scaling rules:
- When CPU > 80% for 60 seconds → add 1 instance
- When memory > 90% for 60 seconds → add 2 instances
- Minimum 2 instances, maximum 10 instances
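Assuming the open-source App Autoscaler CLI plugin (which matches the policy format above) is installed, you can confirm the attached policy and review the scaling actions it has taken:
```bash
# Show the attached policy
$ cf autoscaling-policy my-app

# Review recent scaling events
$ cf autoscaling-history my-app
```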
Health Checks and Self-Healing
PCF automatically manages instance health.
Three Health Check Types
1. Health Endpoint (Recommended)
Application exposes endpoint that indicates health:
```bash
$ cf push my-app --health-check-type http --health-check-http-endpoint /health
```
Your application:
```javascript
app.get('/health', (req, res) => {
  // Check dependencies if needed
  res.status(200).json({ status: 'healthy' });
});
```
PCF checks endpoint every 30 seconds. Failing response → restart instance.
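The endpoint can also verify critical dependencies; a sketch, assuming a hypothetical `db` client with a `ping()` method:
```javascript
app.get('/health', async (req, res) => {
  try {
    await db.ping(); // hypothetical dependency check
    res.status(200).json({ status: 'healthy' });
  } catch (err) {
    // A non-2xx response marks this instance unhealthy
    res.status(503).json({ status: 'unhealthy', error: err.message });
  }
});
```
Be selective about which dependencies fail the check: if a shared dependency such as the database is down, failing every instance's check triggers restarts that can't fix the underlying outage.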
2. Port Health Check
PCF simply checks if port is listening:
```bash
$ cf push my-app --health-check-type port
```
If app stops listening on $PORT → restart.
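For this to work, the app must bind to the port PCF assigns through the `PORT` environment variable; a minimal Node.js sketch:
```javascript
const express = require('express');
const app = express();

app.get('/', (req, res) => res.send('ok'));

// PCF injects $PORT; the port health check probes this socket
const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`Listening on ${port}`));
```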
3. Process Health Check
PCF verifies app process is running:
```bash
$ cf push my-app --health-check-type process
```
If process exits → restart.
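Process checks are a fit for background workers that never listen on a port (typically pushed with `--no-route`); a sketch, assuming a hypothetical `processJob` function:
```javascript
// Background worker: no HTTP server, so only the process itself is monitored
async function run() {
  while (true) {
    await processJob(); // hypothetical job handler
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}

// An unhandled error exits the process, and Diego restarts the instance
run().catch((err) => {
  console.error(err);
  process.exit(1);
});
```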
Health Check Configuration
```bash
# Specify endpoint and timeout
$ cf push my-app \
    --health-check-type http \
    --health-check-http-endpoint /status \
    --health-check-invocation-timeout 5 \
    --health-check-interval 30

# Health check fails if:
# - Response is not a 2xx status
# - Response takes longer than 5 seconds
# - Process crashes
```
Failure Cascade
1. App instance running
2. Health check fails
3. Diego marks instance unhealthy
4. Gorouter removes from routing
5. Diego restarts instance
6. Health check succeeds
7. Gorouter re-adds to routing
8. Traffic flows again
Entire process typically < 60 seconds.
Availability Patterns
1. Multi-Instance Deployment
```bash
$ cf scale my-app -i 3
# my-app now has 3 instances running
```
Benefits:
- One instance fails → 2/3 still serving users
- Gorouter removes the failed instance immediately
- Diego automatically restarts the failed instance
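Each instance can also identify itself via the `CF_INSTANCE_INDEX` environment variable, useful for tagging logs or electing one instance for singleton work; a sketch, with a hypothetical `runNightlyCleanup` job:
```javascript
const instanceIndex = parseInt(process.env.CF_INSTANCE_INDEX || '0', 10);

// Tag responses with the instance that served them
app.get('/', (req, res) => {
  res.send(`Hello from instance ${instanceIndex}`);
});

// Run scheduled singleton work on instance 0 only
if (instanceIndex === 0) {
  setInterval(runNightlyCleanup, 24 * 60 * 60 * 1000); // hypothetical job
}
```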
2. Geographical Redundancy
Deploy to multiple availability zones:
```
┌─ Zone A ─────────────────┐
│ my-app instance 0        │
│ my-app instance 1        │
└──────────────────────────┘

┌─ Zone B ─────────────────┐
│ my-app instance 2        │
│ my-app instance 3        │
└──────────────────────────┘

Load balancer / DNS points to both zones
```
3. Graceful Shutdown
Best practice: handle SIGTERM and drain connections:
```javascript
// Node.js example
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');

  // Stop accepting new requests
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });

  // Wait up to 30 seconds for existing requests
  setTimeout(() => {
    console.log('Forced shutdown');
    process.exit(1);
  }, 30000);
});
```
When Diego stops an instance:
1. It sends SIGTERM
2. The app has ~30 seconds to finish in-flight requests
3. If it hasn't shut down, SIGKILL is sent
4. Gorouter has already removed the instance from routing
4. Circuit Breaker Pattern
Prevent cascading failures:
```javascript
// Circuit breaker around an external API call (opossum library)
const CircuitBreaker = require('opossum');

const breaker = new CircuitBreaker(
  () => fetch('https://external-api.com/data').then((r) => r.json()),
  { timeout: 3000, errorThresholdPercentage: 50 }
);

app.get('/data', async (req, res) => {
  try {
    res.json({ data: await breaker.fire() });
  } catch (err) {
    // Breaker is open or the call failed: serve cached data instead
    res.json({ data: cachedData });
  }
});
```
Resource Management
Quotas: Org Level
Limit resources per organization:
```bash
# Create an org quota
$ cf create-quota large \
    -m 100G \
    -r 1000 \
    -s 50

# Assign it to the org
$ cf set-quota my-org large
```
The org gets at most:
- 100GB total memory across all apps
- 1000 routes
- 50 service instances
Quotas: Space Level
Further restrict within space:
```bash
$ cf create-space-quota dev-limited -m 10G

$ cf set-space-quota dev dev-limited
```
Space can only use 10GB even if org has 100GB.
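To verify what's in effect, the same CLI generation can list and inspect quotas:
```bash
# Inspect org quotas
$ cf quotas
$ cf quota large

# Inspect space quotas in the targeted org
$ cf space-quotas
$ cf space-quota dev-limited
```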
Instance Sizing Strategy
| Instance Type | Best For | Memory |
|---|---|---|
| Micro | Static sites, APIs | 256MB |
| Small | Simple apps | 512MB |
| Medium | Standard apps | 1GB |
| Large | Memory-intensive | 2-4GB |
| XL | Big processes | 8GB+ |
Scale instances horizontally, not vertically.
Monitoring Availability
Viewing Instances
```bash
$ cf app my-app
name:       my-app
state:      started
instances:  3/3

instance   state     since                 cpu     memory
0          running   2024-04-28 14:22:33   25.1%   256M of 512M
1          running   2024-04-28 14:22:35   18.5%   245M of 512M
2          running   2024-04-28 14:22:34   22.3%   251M of 512M
```
Monitoring Logs
```bash
# Tail logs in real-time
$ cf logs my-app

# View recent logs
$ cf logs my-app --recent

# Logs from all instances are aggregated,
# marked with instance index: [APP/instance]
```
Events
```bash
$ cf events my-app

time                       type               actor              description
2024-04-28T14:22:00.000Z   audit.app.update   user@example.com   state changed to started
2024-04-28T14:22:05.000Z   app.crash          index 0            app instance crashed
2024-04-28T14:22:07.000Z   app.update         diego              app instance added
```
Next: Local Setup