[codex] integrate vivaldi latency state #61
Open
miciav wants to merge 63 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…trategies Wire faasprovider.FaaSProvider into RecalcStrategy, NodeMarginStrategy, StaticStrategy, and AllLocalStrategy; use faasprovider.NewFaaSProvider() in all four strategy factory methods, removing direct offuncs/ofpromq deps. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dcoded OpenFaaS endpoint Create a FaaSProvider once in agent.go and pass it to httpserver.Initialize(), replacing the hardcoded healthCheckOpenFaaS() private function with a call to _faasProvider.HealthCheck() so the health check is platform-agnostic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements faasprovider.FaaSProvider for Apache OpenWhisk by adding
openwhisk.Client that fetches action metadata (name, dfaas.maxrate,
dfaas.timeout_ms) from the /api/v1/namespaces/{ns}/actions endpoint.
Wires the new client into the factory so AGENT_FAAS_PLATFORM=openwhisk
is now functional. Prometheus query methods are stubbed for Task 7.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Change owAnnotation.Value from string to json.RawMessage so that non-string annotation values (objects, booleans, numbers) unmarshal correctly; update annotation() helper to unquote JSON strings or fall back to the raw representation
- Replace deprecated ioutil.ReadAll with io.ReadAll (Go 1.16+)
- Fix HealthCheck() to send the Authorization header via http.NewRequest instead of the bare http.Get call that always fails on secured deployments
- Add an httpClient field (30 s timeout) to Client; use it in both doActionsRequest and HealthCheck instead of http.DefaultClient
- Add TestNewFaaSProvider_OpenWhisk to factory_test.go
- Update mockOpenWhiskServer to reject requests to unexpected URL paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
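The json.RawMessage change described in this commit can be illustrated with a minimal sketch. Only owAnnotation.Value and the unquote-or-fall-back behaviour come from the commit message; the Key field, the JSON tags, parseAnnotations, annotationValue, and the sample payload are assumptions made for illustration and are not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// owAnnotation mirrors one entry of an action's "annotations" array.
// Value stays raw so strings, numbers, booleans, and objects all decode.
// The Key field and the JSON tags are assumptions of this sketch.
type owAnnotation struct {
	Key   string          `json:"key"`
	Value json.RawMessage `json:"value"`
}

// parseAnnotations (hypothetical) decodes a JSON array of annotations.
func parseAnnotations(payload []byte) ([]owAnnotation, error) {
	var anns []owAnnotation
	err := json.Unmarshal(payload, &anns)
	return anns, err
}

// annotationValue (hypothetical) returns the value stored under key.
// JSON strings are unquoted; any other JSON value is returned as raw text.
func annotationValue(anns []owAnnotation, key string) (string, bool) {
	for _, a := range anns {
		if a.Key != key {
			continue
		}
		var s string
		if err := json.Unmarshal(a.Value, &s); err == nil {
			return s, true
		}
		return string(a.Value), true
	}
	return "", false
}

func main() {
	// Illustrative payload: one string, one number, one boolean value.
	payload := []byte(`[
		{"key": "dfaas.maxrate", "value": "100"},
		{"key": "dfaas.timeout_ms", "value": 3000},
		{"key": "example.flag", "value": false}
	]`)
	anns, err := parseAnnotations(payload)
	if err != nil {
		panic(err)
	}
	for _, key := range []string{"dfaas.maxrate", "dfaas.timeout_ms", "example.flag"} {
		v, _ := annotationValue(anns, key)
		fmt.Printf("%s = %s\n", key, v)
	}
}
```

With a plain string field, the second and third entries would fail to unmarshal; keeping the value raw defers the type decision to the helper.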
- Add prometheusHost field to Client struct and NewWithPrometheus constructor
so tests can inject a mock Prometheus server without touching production config
- Add promQuery helper that uses c.httpClient (not DefaultClient) with proper
body draining and io.ReadAll (no ioutil)
- Implement QueryAFET using openwhisk_action_duration_seconds_sum/count with
the "action" label instead of OpenFaaS "function_name"
- Implement QueryInvoc using openwhisk_action_activations_total; map OpenWhisk
"status" label ("success" -> "200", anything else -> "500")
- Implement QueryServiceCount using kube_deployment_status_replicas filtered
by the client namespace, keyed by "deployment" label
- Implement QueryCPUusage and QueryRAMusage using identical node-exporter PromQL
as ofpromq (node-level metrics are platform-independent)
- Implement QueryCPUusagePerFunction and QueryRAMusagePerFunction using
cAdvisor container_* metrics, keyed by "container" label
- Add promtypes.go with typed response structs for all five query shapes
- Add TDD tests for all seven Query* methods via mockPrometheusServer helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
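The status mapping used by QueryInvoc ("success" becomes "200", anything else becomes "500") can be sketched as follows. The invocSample type, both function names, the decision to sum all non-success statuses under one code, and the sample status strings are assumptions for illustration, not the PR's implementation.

```go
package main

import "fmt"

// invocSample is a hypothetical, already-parsed Prometheus sample for
// openwhisk_action_activations_total.
type invocSample struct {
	Action string
	Status string
	Value  float64
}

// statusToCode maps the OpenWhisk "status" label onto the HTTP-style code
// that the rest of the agent expects.
func statusToCode(status string) string {
	if status == "success" {
		return "200"
	}
	return "500"
}

// invocationsByCode groups samples by action and mapped code. Summing all
// non-success statuses under "500" is an assumption of this sketch.
func invocationsByCode(samples []invocSample) map[string]map[string]float64 {
	out := make(map[string]map[string]float64)
	for _, s := range samples {
		code := statusToCode(s.Status)
		if out[s.Action] == nil {
			out[s.Action] = make(map[string]float64)
		}
		out[s.Action][code] += s.Value
	}
	return out
}

func main() {
	// Status strings other than "success" are illustrative only.
	samples := []invocSample{
		{Action: "hello", Status: "success", Value: 40},
		{Action: "hello", Status: "developer_error", Value: 5},
		{Action: "hello", Status: "internal_error", Value: 2},
	}
	fmt.Println(invocationsByCode(samples))
}
```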
…OpenWhisk promquery Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…templates Rename the OpenFaaSHost and OpenFaaSPort fields in the HACfg base struct and all derived structs/templates to FaaSHost and FaaSPort, making the internal naming platform-agnostic. Config struct fields (used for env var mappings) are intentionally left unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ath rewrite Adds BackendPathPrefix helper in faasprovider, FaaSBackendPath field to HACfg, and http-request replace-path rule in all four HAProxy templates so /function/<name> is transparently rewritten to the platform-specific path (no-op for OpenFaaS, /api/v1/namespaces/<ns>/actions/<name> for OpenWhisk). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
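A rough sketch of the path mapping this commit describes is shown below. The two path shapes are taken from the commit message; the function names and signatures, the "openfaas" platform string, and the "guest" namespace are assumptions for illustration. In the PR the rewrite itself is performed by an HAProxy http-request replace-path rule, so rewritePath here only models the intended result.

```go
package main

import (
	"fmt"
	"strings"
)

// backendPathPrefix (hypothetical signature) returns the path prefix under
// which the platform exposes functions.
func backendPathPrefix(platform, namespace string) string {
	if platform == "openwhisk" {
		return "/api/v1/namespaces/" + namespace + "/actions"
	}
	// OpenFaaS already serves /function/<name>, so the rewrite is a no-op.
	return "/function"
}

// rewritePath models the effect of the proxy rule: /function/<name> is
// mapped onto the platform-specific path, other paths pass through.
func rewritePath(path, platform, namespace string) string {
	const front = "/function/"
	if !strings.HasPrefix(path, front) {
		return path
	}
	name := strings.TrimPrefix(path, front)
	return backendPathPrefix(platform, namespace) + "/" + name
}

func main() {
	fmt.Println(rewritePath("/function/hello", "openfaas", ""))
	fmt.Println(rewritePath("/function/hello", "openwhisk", "guest"))
}
```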
- Add http-request replace-path to per-function be_{{funcName}} backends
in haproxycfgnms.tmpl and haproxycfgstatic.tmpl so OpenWhisk receives
the correct /api/v1/namespaces/<ns>/actions/<name> path for locally-
weighted direct-client requests, not just node-forwarded ones.
- Rename AGENT_OPENFAAS_HOST/PORT → AGENT_FAAS_HOST/AGENT_FAAS_PORT
across config.go, agent.go, strategyfactory.go, all four strategy
files, values.yaml, values-openwhisk.yaml, and docs/commands.md.
The old names implied OpenFaaS even when OpenWhisk was configured.
- Escape function names with regexp.QuoteMeta before building the
container=~ regex filter in QueryCPUusagePerFunction and
QueryRAMusagePerFunction to prevent PromQL injection when action
names contain regex metacharacters (e.g. dots in package names).
- Remove unreachable dead code in staticstrategy.publishNodeInfo()
(duplicate GetFuncsNames call after return nil).
- Remove CLAUDE.md from .gitignore (the file is tracked; the entry
was a no-op and only caused confusion).
- Pin OpenWhisk Helm chart to --version 1.0.1 in docs/commands.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
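The regexp.QuoteMeta escaping mentioned above could look roughly like the sketch below. The helper names are hypothetical. Wrapping the pattern with strconv.Quote is an additional assumption of this sketch (so that the backslashes added by QuoteMeta survive inside a double-quoted PromQL string); the commit message does not say how the PR quotes the pattern.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// functionRegex (hypothetical) escapes each name and joins them into one
// alternation, so a dot in "utils.resize" matches only a literal dot.
func functionRegex(names []string) string {
	escaped := make([]string, len(names))
	for i, n := range names {
		escaped[i] = regexp.QuoteMeta(n)
	}
	return strings.Join(escaped, "|")
}

// containerMatcher (hypothetical) builds the container=~ label matcher.
// strconv.Quote doubles the backslashes for the double-quoted string.
func containerMatcher(names []string) string {
	return "container=~" + strconv.Quote(functionRegex(names))
}

func main() {
	fmt.Println(containerMatcher([]string{"utils.resize", "hello"}))
}
```

Without the escaping step, a name such as "utils.resize" would also match "utilsXresize", and a name containing an unbalanced bracket would make the query invalid.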
Remove e2e_render_control_plane_sync_env, e2e_deploy_control_plane, e2e_deploy_function_runtime, e2e_verify_core_pods_running, e2e_verify_control_plane_health, and e2e_dump_core_pod_logs — all NanoFaaS-specific helpers that have no place in the generic DFaaS e2e library. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Delete e2e_register_pool_function, e2e_kubectl_curl_control_plane, e2e_extract_json_by_field, e2e_extract_execution_id, e2e_extract_execution_status, e2e_extract_bool_field, e2e_extract_numeric_field, e2e_invoke_sync_message, e2e_enqueue_message, e2e_fetch_execution, e2e_wait_execution_success, e2e_enqueue_message_burst, e2e_get_control_plane_pod_name, and e2e_fetch_control_plane_prometheus. These were NanoFaaS-specific helpers with no role in the DFaaS e2e suite. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces MakeCommonCallback in agent/commondispatch.go, which wraps any strategy's OnReceived callback with a pre-filter that intercepts common broadcast messages (heartbeat, overload_alert, function_event) and updates the CommonNodeTable before delegating to the strategy callback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
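The pre-filter pattern described here can be sketched as a small decorator. The three message type names come from the commit message; the callback signature, the msg_type JSON field, and the simplified nodeTable (a stand-in for CommonNodeTable that only counts messages) are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// callback is a hypothetical stand-in for a strategy's OnReceived function.
type callback func(data []byte) error

// msgEnvelope peeks at the message type; the JSON field name is assumed.
type msgEnvelope struct {
	MsgType string `json:"msg_type"`
}

var commonTypes = map[string]bool{
	"heartbeat":      true,
	"overload_alert": true,
	"function_event": true,
}

// nodeTable is a simplified stand-in for CommonNodeTable.
type nodeTable struct {
	mu   sync.Mutex
	seen map[string]int
}

func newNodeTable() *nodeTable { return &nodeTable{seen: make(map[string]int)} }

func (t *nodeTable) record(msgType string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.seen[msgType]++
}

func (t *nodeTable) count(msgType string) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.seen[msgType]
}

// makeCommonCallback wraps next: common broadcast messages update the table
// first, then every message is passed on to the strategy callback.
func makeCommonCallback(table *nodeTable, next callback) callback {
	return func(data []byte) error {
		var env msgEnvelope
		if err := json.Unmarshal(data, &env); err == nil && commonTypes[env.MsgType] {
			table.record(env.MsgType)
		}
		return next(data)
	}
}

func main() {
	table := newNodeTable()
	cb := makeCommonCallback(table, func(data []byte) error {
		fmt.Println("strategy received:", string(data))
		return nil
	})
	_ = cb([]byte(`{"msg_type":"heartbeat"}`))
	_ = cb([]byte(`{"msg_type":"strategy_specific"}`))
	fmt.Println("heartbeats recorded:", table.count("heartbeat"))
}
```

Because the wrapper has the same type as the function it wraps, the strategies themselves need no changes to benefit from the shared table.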
Initialize DirectMessenger (with configurable timeout and 5s fallback) before the load balancer, create a CommonNodeTable with a TTL of 3× HeartbeatInterval (30s fallback), and wrap the strategy callback with MakeCommonCallback so all incoming PubSub messages update the shared table before being forwarded to the active strategy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
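The two fallbacks named in this commit can be expressed as small helpers. The function names are hypothetical, and treating any non-positive value as "not configured" is an assumption of this sketch; only the 3× multiplier and the 30 s and 5 s defaults come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// nodeTTL returns three heartbeat intervals, or 30s when the interval is
// unset (assumed here to mean zero or negative).
func nodeTTL(heartbeat time.Duration) time.Duration {
	if heartbeat <= 0 {
		return 30 * time.Second
	}
	return 3 * heartbeat
}

// directTimeout returns the configured timeout, or 5s when it is unset.
func directTimeout(configured time.Duration) time.Duration {
	if configured <= 0 {
		return 5 * time.Second
	}
	return configured
}

func main() {
	fmt.Println(nodeTTL(0), nodeTTL(20*time.Second))
	fmt.Println(directTimeout(0), directTimeout(2*time.Second))
}
```

A TTL of several heartbeat intervals lets a peer miss a heartbeat or two before its entry expires.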
Move the duplicated msgTypeEnvelope struct to msgtypes.MsgEnvelope and CommonMsgTypes to msgtypes.CommonBroadcastTypes so both commondispatch and communication/direct share a single canonical definition. Also combine the two Write syscalls in writeMsg into a single buffered write. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Describes the interface hierarchy (PeriodicStrategy, EventDrivenStrategy, HybridStrategy), the runner dispatcher, event flow, concurrency guarantees, and migration path for existing strategies. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10-task TDD plan covering: interface definitions, three runner implementations (periodic/event-driven/hybrid), strategy migrations, and agent.go wiring with context-based shutdown. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds periodicRunner that drives a PeriodicStrategy on a fixed ticker, with context-cancellation support and a 1-minute default period fallback. NewRunner dispatches to the correct runner via type switch (HybridStrategy > EventDrivenStrategy > PeriodicStrategy). Placeholder runners (noopRunner, hybrid-as-periodic) are clearly marked for Tasks 3/4. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
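The ordering of the type switch matters because a hybrid strategy also satisfies the two narrower interfaces. The sketch below shows the idea under assumed method sets: Period, Tick, and OnReceived appear in this PR's commit messages, while OnTrigger, the return types, the stub types, and runnerKind are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type PeriodicStrategy interface {
	Period() time.Duration
	Tick(ctx context.Context) error
}

type EventDrivenStrategy interface {
	OnReceived(msg []byte) error
	OnTrigger(ctx context.Context) error // hypothetical method
}

type HybridStrategy interface {
	PeriodicStrategy
	EventDrivenStrategy
}

// runnerKind checks the most specific interface first. If PeriodicStrategy
// came first, a hybrid strategy would be treated as merely periodic.
func runnerKind(s any) string {
	switch s.(type) {
	case HybridStrategy:
		return "hybrid"
	case EventDrivenStrategy:
		return "event-driven"
	case PeriodicStrategy:
		return "periodic"
	default:
		return "unsupported"
	}
}

// Stub strategies used only to exercise the switch.
type periodicStub struct{}

func (periodicStub) Period() time.Duration      { return time.Second }
func (periodicStub) Tick(context.Context) error { return nil }

type eventStub struct{}

func (eventStub) OnReceived([]byte) error         { return nil }
func (eventStub) OnTrigger(context.Context) error { return nil }

type hybridStub struct {
	periodicStub
	eventStub
}

func main() {
	fmt.Println(runnerKind(periodicStub{}), runnerKind(eventStub{}), runnerKind(hybridStub{}))
}
```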
Replaces the noopRunner placeholder with a real eventDrivenRunner that forwards trigger events to a worker goroutine via a capacity-1 channel, collapses bursts with an optional debounce window, and delegates all pubsub messages to OnReceived. Also fixes the test mock counters to use sync/atomic.Int32 so the race detector passes cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
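The capacity-1 channel and debounce window can be illustrated with a short sketch. All names here are hypothetical, and the real eventDrivenRunner may structure its worker differently; the sketch shows only the coalescing idea: a non-blocking send drops triggers while one is already pending, and the worker waits out the debounce window before doing the work once.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// trigger coalesces bursts: at most one pending signal is kept.
type trigger struct {
	ch chan struct{}
}

func newTrigger() *trigger { return &trigger{ch: make(chan struct{}, 1)} }

// fire never blocks. It reports false when a signal was already pending,
// in which case the new one is merged into it.
func (t *trigger) fire() bool {
	select {
	case t.ch <- struct{}{}:
		return true
	default:
		return false
	}
}

// run is the worker loop: wait for a signal, optionally wait out the
// debounce window, drop any signal that arrived meanwhile, then do the work.
func (t *trigger) run(ctx context.Context, debounce time.Duration, work func()) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.ch:
		}
		if debounce > 0 {
			timer := time.NewTimer(debounce)
			select {
			case <-ctx.Done():
				timer.Stop() // release the timer on shutdown
				return
			case <-timer.C:
			}
			select {
			case <-t.ch: // collapse a signal received during the window
			default:
			}
		}
		work()
	}
}

func main() {
	t := newTrigger()
	for i := 0; i < 3; i++ {
		t.fire() // a burst of three triggers
	}
	ctx, cancel := context.WithCancel(context.Background())
	runs := 0
	t.run(ctx, 10*time.Millisecond, func() {
		runs++
		cancel() // stop after the first run so the demo terminates
	})
	fmt.Println("work ran", runs, "time(s)") // prints: work ran 1 time(s)
}
```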
Replace RunStrategy() self-managed loops in RecalcStrategy, StaticStrategy, AllLocalStrategy, and NodeMarginStrategy with Period() and Tick(ctx) methods. Move AllLocalStrategy.prevFuncs and NodeMarginStrategy pre-loop init (maxValues thresholds, nodeInfo fields) to their respective factory createStrategy(). Add compile-time interface checks (var _ PeriodicStrategy = ...). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
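A minimal sketch of the shape this migration gives a strategy: the strategy exposes Period() and Tick(ctx), and an external loop owns the ticker. The countingStrategy type, runPeriodic, demoTicks, and the error return type of Tick are assumptions for illustration; the compile-time check uses the standard var _ idiom mentioned in the commit.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type PeriodicStrategy interface {
	Period() time.Duration
	Tick(ctx context.Context) error
}

// countingStrategy is a toy strategy that stops the loop after stopAfter ticks.
type countingStrategy struct {
	period    time.Duration
	ticks     int
	stopAfter int
	stop      context.CancelFunc
}

// Compile-time check: the build fails if the method set drifts.
var _ PeriodicStrategy = (*countingStrategy)(nil)

func (s *countingStrategy) Period() time.Duration { return s.period }

func (s *countingStrategy) Tick(ctx context.Context) error {
	s.ticks++
	if s.ticks >= s.stopAfter {
		s.stop()
	}
	return nil
}

// runPeriodic is a stand-in for the runner: the loop that each strategy
// used to manage itself now lives in one place.
func runPeriodic(ctx context.Context, s PeriodicStrategy) error {
	ticker := time.NewTicker(s.Period())
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if ctx.Err() != nil { // cancellation wins over a queued tick
				return ctx.Err()
			}
			if err := s.Tick(ctx); err != nil {
				return err
			}
		}
	}
}

// demoTicks runs the toy strategy until it cancels itself.
func demoTicks() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	s := &countingStrategy{period: time.Millisecond, stopAfter: 3, stop: cancel}
	_ = runPeriodic(ctx, s)
	return s.ticks
}

func main() {
	fmt.Println("ticks:", demoTicks())
}
```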
- makeTrigSet: replaces identical trigSet construction in eventdriven/hybrid runners
- makeTriggerCallback: replaces identical Callback() body in both runners
- effectivePeriod: replaces period-defaulting duplication in periodic/hybrid runners
- sleepOrCtx: replaces three identical context-aware select blocks in RecalcStrategy.Tick
- Stop debounce timer on ctx.Done() in eventDrivenRunner for proper cleanup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
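Two of the extracted helpers lend themselves to a short sketch. The names sleepOrCtx and effectivePeriod come from the commit message, and the 1-minute default is stated earlier in this PR; the signatures, the error return value, and the demo function are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepOrCtx waits for d or until ctx is done, whichever comes first.
// It returns ctx.Err() when interrupted and nil after a full sleep.
func sleepOrCtx(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return nil
	}
}

// effectivePeriod applies the 1-minute default when no valid period is set
// (assumed here to mean zero or negative).
func effectivePeriod(p time.Duration) time.Duration {
	if p <= 0 {
		return time.Minute
	}
	return p
}

// demo returns the result of an interrupted sleep and of a completed one.
func demo() (interrupted, completed error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // already cancelled: the long sleep returns immediately
	interrupted = sleepOrCtx(ctx, time.Hour)
	completed = sleepOrCtx(context.Background(), time.Millisecond)
	return interrupted, completed
}

func main() {
	interrupted, completed := demo()
	fmt.Println(interrupted, completed, effectivePeriod(0))
}
```

Compared with a bare time.Sleep, this keeps a long wait inside Tick from delaying shutdown.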
…models Adds docs/paper/ with two IEEEtran-formatted sections:
- Section III: Messaging Subsystem (two-plane architecture, common message vocabulary, GossipSub broadcast, CommonNodeTable, directed libp2p streams)
- Section IV: Strategy Execution Models (interface hierarchy, three runner implementations, context propagation, migration of existing strategies)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8088992 to
206e902
Summary
Why
This adds latency-awareness infrastructure to the DFaaS agent while keeping the current Kademlia/mDNS discovery flow and existing strategies unchanged. It prepares the agent for future latency-aware forwarding decisions without introducing a second membership system.
Test Plan
- go test ./...
- go test -cover ./agent/latency/vivaldi
- go test -cover ./agent/latency/vivaldi ./agent/msgtypes ./agent/nodestbl ./agent