Remote developer environments
Problem Statement
Imagine you have a team of 10 developers working on large machine learning (ML) projects. Each developer requires substantial compute resources to train and run their models. As an organization, you can either give each person their own powerful setup or create one central resource that everyone shares. The second option—setting up a shared resource—is usually more cost-effective and allows better resource utilization, since developers can benefit from pooled hardware without each having to purchase a separate high-end machine.
Solution
Remote Development is a strategy where a local machine serves primarily as an interface (like a window), while all the heavy tasks run on a powerful shared server. This setup is typically implemented using a secure connection method, such as SSH.
By using remote development, your team focuses on coding while the central server handles time-consuming ML operations. It also reduces the configuration overhead on individual machines and centralizes data, which can simplify collaboration and data security.
Basic Concepts: Networking
SSH is an application-layer protocol in the TCP/IP stack. In simple terms, a protocol is an agreed-upon way of communicating.
Here is where it sits in the network hierarchy:
1. Application Layer (HTTP, SSH, FTP)
2. Transport Layer (TCP, UDP)
3. Network Layer (IP, ICMP)
4. Network Access Layer (Ethernet, WiFi)
Think of it like a shipping company:
- Application Layer: What you want to send (letter, package)
- Transport Layer: How it should be delivered (express, standard)
- Network Layer: The routing and addressing (which cities/hubs to go through)
- Network Access Layer: The actual vehicles and roads used
If you've worked in a Linux terminal, you've probably touched these layers at some point:
# Network troubleshooting commands use TCP/IP terms
curl https://api.github.com # Application Layer (HTTPS)
netstat -t # Transport Layer (TCP connections)
ping 192.168.1.1 # Network Layer (ICMP over IP)
tcpdump -i eth0 # Network Access Layer (capturing on ethernet)
Basic Concepts: SSH
What happens when you type ssh <user>@<ip>
When you run an ssh command, your data passes down through the following layers:
Your SSH Client
│
▼
SSH Protocol (Encrypts data)
│
▼
TCP (Ensures reliable delivery)
│
▼
IP (Routes packets to destination)
│
▼
Network Interface (Physical transmission)
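As a rough sketch of how the parts of an ssh invocation map onto the layers in the diagram, the hypothetical helper `describe_ssh_target` below (not part of OpenSSH) prints which layer each argument belongs to. It assumes the default SSH port 22 when no port is given, uses made-up example values, and never opens a connection.

```shell
# Hypothetical helper: show which layer each part of an ssh invocation belongs to.
# It only prints text and never opens a connection.
describe_ssh_target() (
    user=$1
    host=$2
    port=${3:-22}   # 22 is the default SSH port
    printf 'application: SSH session as %s\n' "$user"
    printf 'transport: TCP port %s\n' "$port"
    printf 'network: IP address %s\n' "$host"
)

# Example values are made up for illustration.
describe_ssh_target alice 192.168.1.50
```

Running the real client with `ssh -v` prints debug output for the connection and authentication steps.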
SSH over a LAN
Think of the early days of LAN parties, where computers connected through a local network. In a local area network, devices communicate with each other directly through a router or a network switch.
- Router or Switch: Acts like a traffic junction directing data to each device.
- Identifiers: Computers talk to each other using IP addresses (e.g., 192.168.x.x).
Sequence
- Computer A obtains the IP of Computer B from the local network.
- The user on Computer A runs ssh user@<ip-of-computer-B>.
- The router or switch forwards the packets directly to Computer B.
- SSH authenticates and establishes the secure session if the credentials are correct.
This is simple because both devices have stable, known IP addresses. There are also no firewalls between them.
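On a LAN where the address is known, the connection details can be saved once and reused. The sketch below uses a hypothetical helper, `print_host_entry`, to print an ssh_config stanza. The alias, IP address, and user name are invented for illustration; `Host`, `HostName`, `User`, and `Port` are standard ssh_config keywords.

```shell
# Hypothetical helper: print an ssh_config stanza for a machine on the LAN.
# Host, HostName, User and Port are standard ssh_config keywords.
print_host_entry() (
    alias_name=$1
    ip=$2
    user=$3
    port=${4:-22}   # default SSH port
    printf 'Host %s\n' "$alias_name"
    printf '    HostName %s\n' "$ip"
    printf '    User %s\n' "$user"
    printf '    Port %s\n' "$port"
)

# Example values are made up; append the output to ~/.ssh/config to use it.
print_host_entry gpu-box 192.168.1.50 alice
```

With that stanza in ~/.ssh/config, `ssh gpu-box` is equivalent to `ssh -p 22 alice@192.168.1.50`.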
SSH over the Internet
When connecting over the internet, the concept is similar but there are additional layers and problems. Let's trace the path: ![[Illustrated Guide to Remote Development-20250211133604484.jpg]]
At each node, there are a few abstractions and indirections to manage:
Your Laptop Layer
- Has only a private IP address (192.168.1.100)
- Cannot be directly reached from the internet
- Private IP address can change each time you connect (via DHCP)
- Needs to know the public IP of the destination
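To make "private IP address" concrete, here is a minimal sketch that checks whether an IPv4 address falls in one of the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The function name `is_private_ipv4` is hypothetical, and the check is a simple pattern match that assumes well-formed dotted-quad input rather than validating it.

```shell
# Hypothetical helper: succeed if the address is in an RFC 1918 private range.
# Assumes a well-formed dotted-quad IPv4 address; it does not validate input.
is_private_ipv4() {
    case $1 in
        10.*|192.168.*) return 0 ;;                            # 10.0.0.0/8, 192.168.0.0/16
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;     # 172.16.0.0/12
        *) return 1 ;;
    esac
}

# Example addresses chosen for illustration.
for ip in 192.168.1.100 172.20.0.5 8.8.8.8; do
    if is_private_ipv4 "$ip"; then
        echo "$ip is private"
    else
        echo "$ip is public"
    fi
done
```

An address that this check reports as private is not routable on the public internet, which is why the laptop cannot be reached directly from outside.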
Home Router (NAT) Layer
- Must maintain a NAT table of connections
- Public IP may change (ISP assigns dynamically)
- Blocks incoming connections by default
- Needs port forwarding rules for incoming connections
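The "blocked by default unless a forwarding rule exists" behavior can be sketched as a table lookup. This is a toy model and not how any particular router is configured: the rule table `FORWARD_RULES`, its format, and the function names are all assumptions made for illustration.

```shell
# Toy model of a router's port-forwarding table (format is an assumption):
# external_port internal_ip internal_port
FORWARD_RULES='2222 192.168.1.100 22
8080 192.168.1.101 80'

# Print "internal_ip:internal_port" for an external port, or nothing if no rule matches.
lookup_forward() {
    printf '%s\n' "$FORWARD_RULES" | while read -r ext ip inport; do
        if [ "$ext" = "$1" ]; then
            printf '%s:%s\n' "$ip" "$inport"
        fi
    done
}

# Decide what happens to an unsolicited incoming connection on a given port.
route_incoming() (
    target=$(lookup_forward "$1")
    if [ -n "$target" ]; then
        echo "forward to $target"
    else
        echo "drop"   # no rule: incoming connections are blocked by default
    fi
)

route_incoming 2222
route_incoming 22
```

In this model, a connection to external port 2222 reaches the SSH daemon on the internal machine, while a connection to port 22 is dropped because no rule covers it.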
Your ISP Layer
- May block certain ports (sometimes including port 22 for SSH)
- Might use carrier-grade NAT (double NAT)
- Can throttle or shape traffic
- May have unstable routing
Internet Backbone Layer
- Variable latency
- Possible packet loss
- Route changes
- Multiple hops between networks
Their ISP Layer
- Similar issues to Your ISP
- Different routing policies
- Different port restrictions
- Possibly different quality of service
Company Router Layer
- Corporate firewall rules
- May block incoming SSH
- Needs explicit port forwarding
- Access control lists (ACLs)
Remote Server Layer
- Only knows its private IP
- Cannot initiate connections to your laptop
- Needs firewall rules configured
- SSH daemon must be properly configured
As you can see, there are many intermediate steps. The core question remains: how can a server keep a persistent IP address?
There are three fundamental approaches to giving a server a persistent address:
- Static IP or DDNS (the classic approach)
- Reverse tunnel (put a proxy in the middle)
- VPN (skip the public internet and treat the link as a private network)
Solution: Persistent Server Addressing
1. Static IP / DDNS Approach
Concept: The traditional approach - either pin your IP address (Static) or keep updating a DNS record (DDNS) when your IP changes. Like having a fixed postal address or a mail forwarding service.
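The DDNS half of this approach boils down to "notice when the public IP changes, then update the DNS record". The sketch below covers only the change detection. The function name `ddns_check` and the cache file are hypothetical, and the actual DNS update is provider-specific, so it is left as a comment rather than guessed at.

```shell
# Hypothetical helper: compare the current public IP with the last one seen.
# Prints "update" (and remembers the new IP) when it changed, else "unchanged".
ddns_check() (
    current_ip=$1
    cache_file=$2
    last_ip=""
    if [ -f "$cache_file" ]; then
        last_ip=$(cat "$cache_file")
    fi
    if [ "$current_ip" = "$last_ip" ]; then
        echo "unchanged"
    else
        # A real script would call the DNS provider's update API here
        # (provider-specific, not shown).
        printf '%s\n' "$current_ip" > "$cache_file"
        echo "update"
    fi
)

# Example (the address and cache path are made up):
#   ddns_check 203.0.113.7 /tmp/last_public_ip
```

Run periodically, for example from cron, a check like this keeps the DNS name pointing at the server even when the ISP reassigns the address.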
2. Reverse Tunnels
Concept: Instead of clients connecting directly to your server, your server opens an outbound connection to a trusted intermediary (such as ngrok or Cloudflare Tunnel), and the intermediary relays incoming traffic back through that tunnel. It's like having a P.O. Box at the post office: mail goes to the post office first, then to you.
This is particularly popular in development environments. For example, when testing webhook deliveries from GitHub to a local server, or when showing a client a work-in-progress website running on your laptop. Services like Gitpod and GitHub Codespaces use similar technology to expose development ports.
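The same idea can be sketched with plain OpenSSH remote forwarding. This is a stand-in for illustration, not how ngrok or Cloudflare implement their tunnels. The helper `reverse_tunnel_cmd` and the relay host name are hypothetical, and the function only prints the command instead of running it; `-N` (no remote command) and `-R` (remote port forwarding) are standard ssh options.

```shell
# Hypothetical helper: print the ssh command a server behind NAT could run
# to expose a local port through a relay it can reach.
# Usage: reverse_tunnel_cmd RELAY RELAY_PORT LOCAL_PORT
reverse_tunnel_cmd() (
    relay=$1        # user@host of the relay (assumed to be publicly reachable)
    relay_port=$2   # port opened on the relay
    local_port=$3   # port on this machine to expose, e.g. 22 for sshd
    printf 'ssh -N -R %s:localhost:%s %s\n' "$relay_port" "$local_port" "$relay"
)

# Example values are made up; this prints the command without connecting.
reverse_tunnel_cmd tunnel@relay.example.com 2222 22
```

Because the server dials out to the relay, no port forwarding is needed on its own router. Connections made to port 2222 on the relay are carried back through the tunnel to port 22 on the server.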