wiki:Hardware/Compute

Site Navigation

  1. COSMOS Testbed Overview
    1. Concepts
    1. Testbed Workflow
    1. Availability and Resource Status
    1. Events and Conferences
  2. Getting Started
    1. Make an Account
    2. Create and Configure SSH Keys
    3. Make a Reservation
    4. Log in to your Reservation
    5. Control Resources with OMF
    6. Run a Hello World Experiment
    7. Get Help and Support
  3. COSMOS/ORBIT User Guide
    1. The COSMOS Portal
    2. Connecting to the Testbed
    3. Running Experiments
    4. Policies and Support
    5. Quick Links
    1. Policies
    1. Account Creation
    1. Camera Streaming
    1. Scheduling and Reservations
    1. Disk Images
    1. Frequently Asked Questions
    1. Resource Control with OMF
  4. COSMOS Portal
    1. Your First Visit
    2. Setting Up Your Account
    3. Reserving Testbed Time
    4. Monitoring Your Experiment
    5. Connecting via SSH
    6. Managing Disk Images
    7. Joining the Community
    8. Browsing Users and Groups
    9. Tips
  5. Account Management
    1. Edit Profile
    2. Change Password
    3. SSH Keys
  6. Portal Dashboard
    1. Profile Card
    2. Usage Statistics
    3. Community Forum
  7. Directory
    1. Users
    2. Groups
    3. Privacy Note
  8. Disk Images
    1. Browsing Images
    2. Image Details
    3. Searching and Sorting
    4. Managing Your Images
    5. Baseline Images
    6. Saving Custom Images
    7. Storage and Retention
  9. Community Forum
    1. Accessing the Forum
    2. Forum Categories
    3. How to Use the Forum
    4. Forum Etiquette
    5. Privacy and Access
  10. Getting Started with the COSMOS Portal
    1. Creating an Account
    2. Logging In
    3. What to Do After Logging In
  11. SSH Access to Testbed Nodes
    1. Access Model
    2. Console Servers
    3. Basic Connection
    4. SSH Config File
    5. SSH Tunneling
    6. File Transfer
    7. Troubleshooting
  12. Scheduler
    1. Calendar View
    2. Reservation Colors
    3. Creating a Reservation
    4. Competing for a Slot
    5. Modifying or Canceling Reservations
    6. My Reservations
    7. Resource Information
  13. Testbed Status
    1. Node Status Grid
    2. RF Matrix Control (SB4)
    3. Understanding Node States During Experiments
    1. Remote Access
    1. Chrome Remote Desktop Setup Page
  14. Installing Chrome Remote Desktop (CRD) on a Custom Image
    1. Measurement & Result Collection
    1. Storage
    1. Support
    1. Contributing to the Wiki
  15. Tutorials
    1. SDR and Wireless
    2. Wireless Digital Twins
    3. Optical Networking
    4. Wired Networking
    5. Edge Computing
    6. 4G/5G Systems
    7. Orchestration Platforms
  16. Architecture
    1. Data Flow
    1. Deployment Map
    1. Domains
    1. Naming Convention
    1. Networks
    1. Optical
  17. Resources, Services and APIs
    1. RF Control
    2. SDR Control
    3. Compute Control
    4. Network Control
    5. Optical Control
  18. Datasets
  19. Hardware Info
    1. Cameras
    1. Compute
    1. FR3 SDRs
    1. Network
    1. Nodes
    1. Optical
    1. RF Subsystems
    1. Antennas
    1. Full-Duplex Radio
    1. RF Front End
    1. Software Defined Radios (SDR)
  20. RF Policies & Compliance
    1. Outdoor Radio Frequency Allocation
    2. Program Experiment License
    3. Spectrum Monitoring
    4. Emergency Stop Procedures
    5. Network and Platform Security

Compute

The dedicated compute resources are model Dell R740XD servers with the following specifications.

CPU 2x Intel Xeon 12-core
RAM 192 GB
Storage 240 GB SSD
Network 2x 25 GB NICs
FPGA Xilinx Alveo U200 Data Center Accelerator with 2x 100G ports
GPU Nvidia Tesla V100 16GB VRAM (domains: bed, sb1, sb2) or Nvidia Tesla A100 40GB VRAM (domains: weeks)

Block Diagram

This block diagram shows the bus connections of the various processing devices hosted by these servers.

For best performance, you must take into account the data path, from source to sink, as well as processing and memory locality.

GPU

These servers each have an Nvidia GPU (V100 16GB or A100 40GB), for CUDA processing.

FPGA

These servers each have a Xilinx Alveo U200 FPGA, with two 100g qsfp28 ports. The FPGA can be configured either as an additional NIC, or as a standalone computing device.

In an advanced use case, you could use the FPGA to pre-process radio data, and then send over PCIe or the network to a GPU / CPU for further processing.

Network

In addition to the GPU and FPGA, the server has a two (2) port 25g ethernet NIC for data-plane traffic.

The 10G NIC is dedicated to control and provisioning traffic, and should not be relied upon for experimental repeatability.

Last modified 3 years ago Last modified on Apr 4, 2023, 7:33:13 PM

Attachments (1)

Download all attachments as: .zip

Note: See TracWiki for help on using the wiki.