Veeam VMware vSphere Backup Proxy

Choosing the right Veeam proxy server design for your environment gives you considerable control over the impact on the vSphere infrastructure and over the backup traffic flow. Proxies are the workhorses of the backup infrastructure and are critical to achieving good backup and restore speeds.

When thinking about proxy design, you need to be familiar with the different Transport Modes to understand the limitations and requirements that apply to proxy placement and design.

Proxy Placement

Depending on the transport mode you choose, you might require hot-add proxies (Virtual Appliance Mode) or physical proxies (Direct SAN/Backup from Storage Snapshots via FC).

As a general rule, the proxy should be as close to the source data as possible, with a high-bandwidth connection. The traffic from the source to the proxy is not yet optimized, meaning that 100% of the backup data is transferred over this link.

Also ensure a good connection between proxy and repository. Optimized data (normally ~50% of the source data size) is transferred over this link.

Proxy OS requirements

We recommend the latest supported version of Windows Server OS for all proxies.

Proxy Sizing

Getting the right amount of processing power is essential to achieving the RTO/RPO defined by the business. In this section, we will outline the recommendations to follow for appropriate sizing.

Processing Resources

Proxies have multiple task slots to process VM source data. It is best practice to plan for 1 physical core or 1 vCPU and 2 GB of RAM for each of these tasks.

A task processes 1 VM disk at a time and CPU/RAM resources are used for inline data deduplication, compression, encryption and other features that are running on the proxy itself.

The User Guide states that proxy servers require 2 GB RAM + 500 MB per task. Please consider these values as minimum requirements. Using the recommendations above allows for growth and for additional inline processing features or other special job settings that increase RAM consumption.

If the proxy is used for other roles, such as Gateway Server for SMB shares, EMC DataDomain DDBoost or HPE StoreOnce Catalyst, or if you run the backup repository on the same server, remember to add up the system requirements of all the components. Please see the related chapter for each component for further details.

Tip: Doubling the proxy server task count will, in general, halve the backup window.
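The difference between the User Guide minimum and the best-practice recommendation can be illustrated with a minimal sketch. The function name `proxy_ram_gb` is hypothetical, and treating 500 MB as 0.5 GB is a simplifying assumption.

```python
def proxy_ram_gb(tasks):
    """Return (minimum, recommended) RAM in GB for a proxy with `tasks` task slots."""
    # User Guide minimum: 2 GB base + 500 MB per task (500 MB taken as 0.5 GB here)
    minimum = 2.0 + 0.5 * tasks
    # Best-practice recommendation from this section: 2 GB per task
    recommended = 2.0 * tasks
    return minimum, recommended


minimum, recommended = proxy_ram_gb(8)
print(f"8 tasks: minimum {minimum} GB, recommended {recommended} GB")
```

For an 8-task proxy, this gives 6 GB as the minimum and 16 GB as the recommended value, which shows how much headroom the recommendation leaves.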

Calculating required proxy tasks

Depending on the infrastructure and source storage performance, these numbers may turn out to be too conservative. We recommend performing a POC to determine the specific numbers for the environment.

D = Source data in MB
W = Backup window in seconds
T = Throughput in MB/s = D/W
CR = Change rate (as a fraction, for example 0.10 for 10%)
CF = Cores required for full backup = T/100
CI = Cores required for incremental backup = (T * CR)/25

Example

Our sample infrastructure has the following characteristics:

  • 1,000 VMs
  • 100 TB of consumed storage
  • 8 hours backup window
  • 10% change rate

By inserting these numbers into the equations above, we get the following results.

D = 100 TB * 1024 * 1024 = 104,857,600 MB
W = 8 hours * 3600 seconds = 28,800 seconds
T = 104,857,600/28,800 ~ 3,641 MB/s

We use the average throughput to predict how many cores are required to meet the defined SLA.

CF = T/100 ~ 36 cores

For incremental backups, the equation is applied to the changed data only and assumes a lower per-core throughput (25 MB/s instead of 100 MB/s), which gives the following result:

CI = (T * CR)/25 ~ 14 cores

As seen above, incremental backups typically have lower compute requirements on the proxy servers.
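The calculation above can be reproduced with a short script. This is a sketch only: the function name `proxy_cores` and its parameter names are hypothetical, and the constants 100 and 25 are taken directly from the CF and CI equations.

```python
MB_PER_TB = 1024 * 1024


def proxy_cores(source_tb, window_hours, change_rate):
    """Return (throughput in MB/s, cores for full, cores for incremental)."""
    d = source_tb * MB_PER_TB          # D: source data in MB
    w = window_hours * 3600            # W: backup window in seconds
    t = d / w                          # T: required throughput in MB/s
    cf = t / 100                       # CF: cores required for full backup
    ci = (t * change_rate) / 25        # CI: cores required for incremental backup
    return t, cf, ci


t, cf, ci = proxy_cores(source_tb=100, window_hours=8, change_rate=0.10)
print(f"T  = {t:,.0f} MB/s")
print(f"CF = {cf:.1f} cores")
print(f"CI = {ci:.1f} cores")
```

The unrounded results are roughly 36.4 and 14.6 cores; the example in this section works with 36 and 14.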

Considering each task consumes up to 2 GB RAM, we get the following result:

36 cores and 72 GB RAM

  • For a physical server, it is recommended to install dual CPUs with 10 cores each. 2 physical servers are required.
  • For virtual proxy servers, it is recommended to configure multiple proxies with maximum 8 vCPUs to avoid co-stop scheduling issues. 5 virtual proxy servers are required.

If we instead size only for incremental backups rather than full backups, we can predict the resulting full backup window (WS, in seconds) with less compute:

WS = 104,857,600/(14 * 100) ~ 74,898 seconds
W = WS/3600 ~ 21 hours
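The same estimate can be expressed as a small helper. This is a sketch under the assumption that each core sustains 100 MB/s during a full backup, as implied by CF = T/100; the function name is hypothetical.

```python
FULL_MB_PER_S_PER_CORE = 100  # per-core full backup throughput implied by CF = T/100


def full_backup_window_hours(source_mb, cores):
    """Estimate the full backup window in hours for a given core count."""
    window_seconds = source_mb / (cores * FULL_MB_PER_S_PER_CORE)
    return window_seconds / 3600


source_mb = 100 * 1024 * 1024  # 100 TB expressed in MB
print(f"{full_backup_window_hours(source_mb, cores=14):.1f} hours")
```

With 14 cores, the estimate lands just under 21 hours, in line with the rounded figure above.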

If the business can accept this increased backup window for periodic full backups, it is possible to lower the compute requirement by more than 2x and get the following result:

14 cores and 28 GB RAM

  • For a physical server, it is recommended to install dual CPUs with 10 cores each. 1 physical server is required.
  • For virtual proxy servers, it is recommended to configure multiple proxies with maximum 8 vCPUs to avoid co-stop scheduling issues. 2 virtual proxy servers are required.
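The server counts in both sizing results follow from rounding up the core requirement to whole servers. A minimal sketch, assuming 20 cores per physical server (dual 10-core CPUs), 8 vCPUs per virtual proxy and 2 GB RAM per task; the function name is hypothetical.

```python
import math

PHYSICAL_CORES_PER_SERVER = 20   # dual CPUs with 10 cores each
VCPUS_PER_VIRTUAL_PROXY = 8      # maximum recommended to avoid co-stop issues
RAM_GB_PER_TASK = 2              # one task per core


def size_proxies(cores):
    """Return (RAM in GB, physical servers, virtual proxies) for a core count."""
    ram_gb = cores * RAM_GB_PER_TASK
    physical = math.ceil(cores / PHYSICAL_CORES_PER_SERVER)
    virtual = math.ceil(cores / VCPUS_PER_VIRTUAL_PROXY)
    return ram_gb, physical, virtual


for cores in (36, 14):
    ram_gb, physical, virtual = size_proxies(cores)
    print(f"{cores} cores: {ram_gb} GB RAM, "
          f"{physical} physical server(s) or {virtual} virtual proxies")
```

For 36 cores this yields 72 GB RAM, 2 physical servers or 5 virtual proxies; for 14 cores it yields 28 GB RAM, 1 physical server or 2 virtual proxies.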

If you need to halve the backup window (to 4 hours), double the resources: 2x the amount of compute power, split across multiple servers.

The same rule applies if the change rate is 2x higher (20% change rate). To process twice the amount of changed data, you also need to double the proxy resources.
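Both doubling rules follow from the sizing equations being linear in throughput and change rate. The following sketch checks this with the sample numbers; the function names are hypothetical.

```python
def full_cores(source_mb, window_s):
    """CF = T/100, with T = D/W."""
    return source_mb / window_s / 100


def incremental_cores(source_mb, window_s, change_rate):
    """CI = (T * CR)/25, with T = D/W."""
    return source_mb / window_s * change_rate / 25


D = 100 * 1024 * 1024  # 100 TB in MB

# Halving the backup window from 8 to 4 hours
window_factor = full_cores(D, 4 * 3600) / full_cores(D, 8 * 3600)

# Doubling the change rate from 10% to 20%
change_factor = (incremental_cores(D, 8 * 3600, 0.20)
                 / incremental_cores(D, 8 * 3600, 0.10))

print(f"Half the window:       {window_factor:.1f}x cores")
print(f"Double the change rate: {change_factor:.1f}x cores")
```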

Note: Performance largely depends on the underlying storage and network infrastructure.

The required processing resources may seem high compared with traditional agent-based solutions. However, instead of using the resources of every VM for all backup operations (including data transport, source deduplication and compression), Veeam Backup & Replication uses its proxy and repository resources to offload the virtual infrastructure. Overall, the CPU and RAM resources used by backup and replication jobs are typically below 5% (and in many cases below 3%) of all virtualization resources.

How Many Tasks per Proxy?

Typically, in a virtual environment, proxy servers use 4, 6 or 8 vCPUs. In physical environments, a server with a single quad-core CPU is sufficient for small sites, while more powerful systems (dual 10-16 core CPUs) are typically deployed at the main datacenter with the Direct SAN Access processing mode.

Note: Parallel processing may also be limited by max concurrent tasks at the repository level.

So, in a virtual-only environment you will have slightly more proxies with a smaller task slot count per proxy, while in a physical infrastructure with a good storage connection you will have a very high parallel task count per proxy.

The “sweet spot” in a physical environment is about 20 processing tasks on a dual 10-core CPU proxy with 48 GB RAM and two 16 Gbps FC cards for read, plus one or two 10GbE network cards.

Depending on the primary storage system and backup target storage system, any of the following methods can be recommended to reach the best backup performance:

  • Running fewer proxy tasks with a higher throughput per proxy task

  • Running a higher proxy task count with less throughput per task

As performance depends on multiple factors such as storage load, connection, firmware level, RAID configuration, access methods and more, it is recommended to run a Proof of Concept to determine the optimal configuration and the best possible processing mode.

References