= OMF Services Deployment Topology =

'''Last updated: 2026-03-13'''

[[PageOutline(2-3)]]

== 1. Overview ==

The ORBIT/COSMOS testbed infrastructure spans two sites connected via IP tunnel:

 * '''North Brunswick, NJ (Rutgers)''' -- Primary ORBIT site
 * '''New York City, NY (Columbia)''' -- Primary COSMOS site

The platform runs '''19 Ruby/Sinatra microservices''', '''1 Python/Flask service''' (omf-array-mgmt), and '''1 CLI tool''' (omf-expctl) across '''23+ hosts''' with '''35+ service instances'''.

All Ruby services share the {{{omf-common}}} git submodule (Sinatra base class with DSL for route definition, XML/JSON response formatting, Prometheus metrics, and configuration loading). Services are packaged as Debian packages via FPM and managed by systemd.

The canonical entry point for all service API calls is the '''AM proxy''' at {{{am1:5054}}} ({{{omf-agg-mgr-proxy}}}), which provides service discovery and request routing. The '''cosmos-portal''' (React SPA on web1) provides the web UI, with Apache reverse-proxying API calls to backend services.

== 2. Network Architecture ==

=== IP Address Ranges ===

||= Site =||= Range =||= Notes =||
|| North Brunswick (Rutgers) || {{{10.0.0.0 -- 10.63.255.255}}} || Primary ORBIT infrastructure ||
|| New York City (Columbia) || {{{10.64.0.0 -- 10.127.255.255}}} || Primary COSMOS infrastructure ||

=== VLAN Structure ===

 * '''Management VLAN''' -- Infrastructure servers, out-of-band management
 * '''Control VLAN''' (per domain) -- Service-to-node communication, PXE boot
 * '''Data VLAN''' (per domain) -- Experiment data traffic between testbed nodes
 * '''IP tunnel''' connecting North Brunswick and NYC sites

=== DNS Domains ===

 * {{{orbit-lab.org}}} -- ORBIT nodes and infrastructure
 * {{{cosmos-lab.org}}} -- COSMOS nodes and infrastructure

DNS is served by BIND9 on mgmt1/mgmt2 with 2,563 forward A records across 25 zones (11 ORBIT + 14 COSMOS) and 53 reverse zones.

== 3. North Brunswick Hosts ==

=== am1 (10.50.0.41) -- Aggregate Manager / RF Services ===

||= Service =||= Port =||= Version =||= Description =||
|| omf-agg-mgr-proxy || 5054 || -- || Service discovery proxy (routes API calls to backend services) ||
|| omf-rf-control || 5001 || v0-2 || RF signal generator control ||
|| omf-rf-switch || 5002 || v0-2 || RF switch matrix control ||
|| omf-xy-table || 5003 || -- || XY table positioning service ||
|| omf-array-mgmt || 5004 || -- || Antenna array management ('''Python/Flask''') ||

'''Platform:''' Ubuntu 18.04, RVM Ruby 3.2.3, Python 3.6.9[[BR]]
'''Note:''' Requires {{{libruby-3.2}}} symlink + ldconfig for native extensions

=== am4 -- Development Server ===

Development machine only. All source repos at {{{/home/seskar/omf-*}}}. Not a production service host.

=== am5 (10.50.0.45) -- Core Services ===

||= Service =||= Port =||= Version =||= Description =||
|| omf-cmc || 5013 || v1-8 || Chassis management controller (power control via IPMI, HTTP CM, SNMP PDU) ||
|| omf-scheduler || 5016 || -- || Reservation scheduler and auto-approver (ActiveRecord + MySQL) ||
|| omf-rfmatrix || 5020 || v1-1 || RF matrix switch control ||
|| omf-status || 5021 || -- || Testbed status aggregation ||

=== repository2 (10.50.0.22) -- Image & Account Services ===

||= Service =||= Port =||= Version =||= Description =||
|| omf-account-mgmt || 5017 || v1-2 || User/group registration, approval, LDAP lifecycle (ActiveRecord + MySQL) ||
|| omf-frisbee || 5011 || v1-3 || Frisbee daemon management for disk image multicasting ||
|| omf-pxe || 5010 || v1-5 || PXE boot configuration (aggregate manager) ||
|| omf-saveimage || 5012 || v1-3 || Disk image save via netcat receiver ||
|| omf-user-stats || 5015 || -- || User disk usage and scheduler usage statistics ||

=== Infrastructure Servers ===

||= Host =||= IP =||= Role =||
|| mgmt1 || 10.250.0.8 || Primary DHCP (ISC DHCP4, 2,145 static hosts) + Primary DNS (BIND9, 2,563 A records) ||
|| mgmt2 || 10.250.0.9 || DHCP failover peer + DNS slave ||
|| db1 || 10.0.0.51 || LibreNMS monitoring (190 devices, SNMP polling every 5 min) ||
|| mysql1 || -- || Shared MySQL server for scheduler, account-mgmt, user-stats ||
|| amqp.orbit-lab.org || -- || RabbitMQ MQTT broker (MQTT 1883, WebSocket 15675) ||
|| web1 || -- || cosmos-portal (React SPA), Apache reverse proxy ||
|| gitlab.orbit-lab.org || 10.50.0.20 || GitLab (24 repos under {{{orbit/}}}) ||
|| ldap1.orbit-lab.org || -- || Primary OpenLDAP server (port 389) ||
|| ldap2.orbit-lab.org || -- || Secondary OpenLDAP server (port 389) ||

=== ORBIT Console Servers (9 hosts) ===

All consoles run '''omf-cmonitor''' on port 5000. Some also run '''omf-expctl'''.[[BR]]
'''Platform:''' Ubuntu 16.04, RVM Ruby 3.2.3, omf-cmonitor v1-1

||= Console =||= omf-cmonitor =||= omf-expctl =||= Notes =||
|| grid.orbit-lab.org || Yes || Yes (v1-19) || Main 20x20 grid ||
|| sb1.orbit-lab.org || Yes || Yes (v1-19) || Sandbox 1 ||
|| sb2.orbit-lab.org || Yes || Yes (v1-19) || Sandbox 2 ||
|| sb3.orbit-lab.org || Yes || Yes (v1-19) || Sandbox 3 ||
|| sb4.orbit-lab.org || Yes || -- || Sandbox 4 ||
|| sb7.orbit-lab.org || Yes || -- || Sandbox 7 ||
|| sb9.orbit-lab.org || Yes || -- || Sandbox 9 ||
|| outdoor.orbit-lab.org || Yes || -- || Outdoor testbed ||
|| instrument.orbit-lab.org || Yes || -- || Instrument cluster ||

=== Unreachable Consoles ===

The following consoles are currently unreachable: {{{vgrid1-4.orbit-lab.org}}}, {{{instrument.cosmos-lab.org}}}

== 4. New York City Hosts ==

=== COSMOS Console Servers (9 hosts) ===

All consoles run '''omf-cmonitor''' on port 5000.[[BR]]
'''Platform:''' Ubuntu 16.04, RVM Ruby 3.2.3, omf-cmonitor v1-1

||= Console =||= omf-cmonitor =||= omf-expctl =||= Notes =||
|| osc.cosmos-lab.org || Yes || -- || Open-access sandbox ||
|| indigo.cosmos-lab.org || Yes || -- || ||
|| accord.cosmos-lab.org || Yes || -- || ||
|| sb1.cosmos-lab.org || Yes || Yes (v1-19) || Sandbox 1 ||
|| sb2.cosmos-lab.org || Yes || -- || Sandbox 2 ||
|| weeks.cosmos-lab.org || Yes || -- || ||
|| rrail.cosmos-lab.org || Yes || -- || ||
|| bed.cosmos-lab.org || Yes || -- || ||
|| nebula.cosmos-lab.org || Yes || -- || ||

=== COSMOS Raspberry Pis ===

||= Host =||= IP =||= Services =||
|| pi1-auden.sb1.cosmos-lab.org || 10.37.25.15 || omf-cosmos-cm (5018, v1-1), omf-auden (5019, v1-1) ||
|| pi2-auden.sb1.cosmos-lab.org || 10.37.25.16 || omf-cosmos-cm (5018, v1-1), omf-auden (5019, v1-1) ||

=== XY Table Controllers ===

||= Host =||= IP =||= Service =||= Notes =||
|| xytable1 || 10.1.37.221 || omf-xytable-ctrl (port 80) || Raspberry Pi, sb1.cosmos-lab.org ||
|| xytable2 || 10.1.37.222 || omf-xytable-ctrl (port 80) || Raspberry Pi, sb1.cosmos-lab.org ||

MQTT telemetry published to {{{xy//position}}} every 200ms via {{{amqp.orbit-lab.org:1883}}}.

== 5. Shared Infrastructure ==

||= Component =||= Host =||= Details =||
|| cosmos-portal || web1 || React 18 + Vite SPA, Tailwind CSS, static files served by Apache ||
|| GitLab || gitlab.orbit-lab.org (10.50.0.20) || 24 repos under {{{orbit/}}} namespace ||
|| NetBox || 10.50.0.93 || v2.9.10 (needs upgrade to 4.x), 290 devices, data from 2020-2021 ||
|| Proxmox (ORBIT) || mgmt-vmhost1..5 || 5x Dell R740 (48 cores, 187GB each), 98 VMs (57 running), Ceph RBD + NFS ||
|| Proxmox (COSMOS) || mgmt-vmhost1..5-co1 || 3x R430 + 2x R740, 13 VMs (8 running) ||

== 6. Service Dependency Map ==

This section documents which services call which other services.

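
As a rough illustration of how the dependency data in this section might be used, the caller/callee pairs listed in the subsections below can be held in an adjacency map and queried for impact analysis ("which services are affected if X goes down?"). This is only a sketch: the map is transcribed by hand from this page, the function name {{{affected_by}}} is hypothetical, and the AM proxy hop used by {{{omf-expctl}}} is deliberately left out.

{{{#!python
from collections import deque

# Caller -> callees, transcribed by hand from the lists and tables in this section.
# The AM proxy hop (am1:5054) is not modelled here.
CALLS = {
    "omf-expctl": ["omf-cmc", "omf-pxe", "omf-frisbee", "omf-saveimage", "omf-scheduler"],
    "omf-cmc": ["omf-cmonitor"],
    "omf-auden": ["omf-rf-control"],
    "omf-status": ["omf-cmc", "omf-scheduler", "omf-frisbee"],
}


def affected_by(service):
    """Return every service that calls `service` directly or transitively."""
    # Invert the caller -> callee map into callee -> callers.
    callers = {}
    for caller, callees in CALLS.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first walk up the caller chain.
    seen = set()
    queue = deque([service])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


print(affected_by("omf-cmonitor"))  # ['omf-cmc', 'omf-expctl', 'omf-status']
}}}

With the data above, an outage of {{{omf-cmonitor}}} would touch {{{omf-cmc}}} directly and {{{omf-expctl}}} and {{{omf-status}}} through it. External dependencies (LDAP, MySQL, SMTP) could be added to the same map if they need to be included in the analysis.
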
=== omf-expctl (Experiment Controller CLI) ===

All calls routed via the AM proxy at {{{am1:5054}}}.

 * {{{omf-expctl}}} -> {{{omf-cmc}}} -- Power control (on/off/reset nodes)
 * {{{omf-expctl}}} -> {{{omf-pxe}}} -- PXE boot setup (set boot image)
 * {{{omf-expctl}}} -> {{{omf-frisbee}}} -- Disk imaging (load images onto nodes)
 * {{{omf-expctl}}} -> {{{omf-saveimage}}} -- Save disk images from nodes
 * {{{omf-expctl}}} -> {{{omf-scheduler}}} -- Permission/reservation check

=== Inter-Service Dependencies ===

||= Caller =||= Callee =||= Purpose =||= Via =||
|| omf-cmc || omf-cmonitor || Wake-on-LAN packet generation || Direct HTTP (CM_wolurl) ||
|| omf-auden || omf-rf-control || RF signal generator setup || Direct (should use AM proxy) ||
|| omf-status || omf-cmc || Node power state || Direct ||
|| omf-status || omf-scheduler || Reservation info || Direct ||
|| omf-status || omf-frisbee || Imaging status || Direct ||

=== External Dependencies ===

||= Service =||= External System =||= Purpose =||
|| omf-scheduler || LDAP (ldap1/ldap2) || Host attribute management (LdapHostManager) ||
|| omf-scheduler || MySQL || Reservation persistence ||
|| omf-scheduler || SMTP (mail.orbit-lab.org:25) || Reservation notifications ||
|| omf-account-mgmt || LDAP (ldap1/ldap2) || User/group lifecycle management ||
|| omf-account-mgmt || MySQL || Account persistence ||
|| omf-account-mgmt || SMTP (mail.orbit-lab.org:25) || Account notifications ||
|| omf-user-stats || MySQL (multiple databases) || Usage data aggregation ||
|| omf-user-stats || LDAP || User lookups ||

=== cosmos-portal (Web UI) ===

All API calls proxied via Apache on web1:

||= Portal Route =||= Backend =||= Service =||
|| {{{/account/*}}} || repository2:5017 || omf-account-mgmt ||
|| {{{/scheduler/*}}} || am5:5016 || omf-scheduler ||
|| {{{/rfmatrix/*}}} || am5:5020 || omf-rfmatrix ||
|| {{{/status/*}}} || am5:5021 || omf-status ||
|| {{{/inventory/*}}} || am5:5012 || omf-newinventory (legacy) ||
|| {{{/user-stats/*}}} || repository2:5015 || omf-user-stats ||
|| {{{/mqtt/ws}}} || amqp.orbit-lab.org:15675/ws || RabbitMQ WebSocket ||

== 7. Port Registry ==

||= Port =||= Service =||= Deployment =||
|| 5000 || omf-cmonitor || Console servers (18 hosts, per domain) ||
|| 5001 || omf-rf-control || am1 ||
|| 5002 || omf-rf-switch || am1 ||
|| 5003 || omf-xy-table || am1 ||
|| 5004 || omf-array-mgmt || am1 ||
|| 5010 || omf-pxe || repository2 ||
|| 5011 || omf-frisbee || repository2 ||
|| 5012 || omf-saveimage || repository2 ||
|| 5013 || omf-cmc || am5 ||
|| 5015 || omf-user-stats || repository2 ||
|| 5016 || omf-scheduler || am5 ||
|| 5017 || omf-account-mgmt || repository2 ||
|| 5018 || omf-cosmos-cm || COSMOS Pis ||
|| 5019 || omf-auden || COSMOS Pis ||
|| 5020 || omf-rfmatrix || am5 ||
|| 5021 || omf-status || am5 ||
|| 5054 || omf-agg-mgr-proxy || am1 ||

'''Next available port: 5022'''

== 8. Technology Stack ==

=== Backend ===

 * '''Ruby 3.2.x''' -- Primary language for all microservices
 * '''Sinatra''' (v4.x) -- Web framework (all services inherit from {{{OMFService}}} base class)
 * '''Puma''' -- Application server
 * '''ActiveRecord''' -- ORM for MySQL-backed services (scheduler, account-mgmt, user-stats)
 * '''Ox''' -- Fast XML parser/generator for OMF XML responses
 * '''Python 3.x / Flask''' -- omf-array-mgmt only
 * '''sinatra-param''' -- Request parameter validation via DSL

=== Data Stores ===

 * '''MySQL / MariaDB''' -- Persistence for scheduler, account-mgmt, user-stats, rfmatrix
 * '''OpenLDAP''' -- User/group directory (ldap1/ldap2, port 389)
 * '''RabbitMQ''' -- MQTT broker for node communication and XY table telemetry

=== Frontend ===

 * '''React 18''' -- cosmos-portal SPA
 * '''Vite''' -- Build tool
 * '''Tailwind CSS''' -- Styling
 * '''Apache''' -- Static file serving + reverse proxy on web1

=== Operations ===

 * '''Debian packaging''' via FPM ({{{make deb}}})
 * '''systemd''' service units (auto-enabled on package install)
 * '''Prometheus''' metrics at {{{/metrics}}} on all services
 * '''LibreNMS''' -- Network device monitoring (190 devices via SNMP)
 * '''Git submodules''' -- {{{omf-common}}} (shared framework), {{{omf-logging-db}}} (database helpers), {{{omf-ldap}}} (LDAP helpers)

=== Configuration ===

Configuration is merged in order (later files override earlier):

 1. {{{default/config.yml}}} -- Built-in defaults (shipped with package)
 2. {{{/etc/omf-services/config.yml}}} -- Global settings
 3. {{{/etc/omf-services/.yml}}} -- Service-specific (e.g., {{{cmonitor.yml}}})
 4. {{{./config.yml}}} -- Development override (not installed in production)

=== PXE Boot Images ===

 * '''omf-5.8''' (current) -- Alpine 3.23, kernel 6.18 LTS, Ruby 3.4.8, MQTT-based RC, 97MB initfs
 * '''omf-5.7''' (legacy) -- Alpine LTS 5.15, Ruby 3.1, XMPP-based RC, 41MB initfs
 * PXE images served from {{{root@repository2:/tftpboot/}}}
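
=== Example: Configuration Layering ===

The merge order given under Configuration ("later files override earlier") can be illustrated with a small sketch. It is written in Python for brevity and is not the actual {{{omf-common}}} loader. Two things here are assumptions rather than documented behaviour: that nested sections are merged key by key (a loader could equally replace whole top-level sections), and the keys {{{port}}} and {{{log}}}, which are hypothetical and used only to make the example concrete.

{{{#!python
def merge_config(*layers):
    """Merge config layers left to right; later layers override earlier ones.

    Nested dicts are merged recursively (an assumption about the real loader).
    The input layers are not modified.
    """
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_config(merged[key], value)
            else:
                merged[key] = value
    return merged


# Hypothetical contents standing in for the parsed YAML files, in load order.
package_defaults = {"port": 5000, "log": {"level": "info", "format": "json"}}
global_settings = {"log": {"level": "warn"}}
service_settings = {"port": 5013}

merged = merge_config(package_defaults, global_settings, service_settings)
print(merged)  # {'port': 5013, 'log': {'level': 'warn', 'format': 'json'}}
}}}

In this sketch the service-specific layer wins for {{{port}}}, the global layer wins for {{{log.level}}}, and {{{log.format}}} falls through from the packaged defaults.
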