This text discusses scaling a Temporal Cluster, focusing on metrics and terminology that apply to scaling any kind of workflow architecture. It outlines the iterative process followed in load testing: set or adjust the level of load, check monitoring for bottlenecks or problem areas under the new load, adjust Kubernetes or Temporal configuration to remove those bottlenecks, and repeat. The text also covers scaling up by increasing the shard count, adjusting CPU and memory requests for History pods, and tuning poller configuration to improve Schedule-to-Start Latency.
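
The load-testing loop summarized above can be sketched roughly as follows. This is a minimal illustration in Python, not the tooling used in the text: `load_test`, `measure`, `adjust`, the `Observation` type, the latency target, and the simulated cluster are all hypothetical stand-ins for reading real monitoring and changing real Kubernetes or Temporal configuration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Observation:
    load: int          # hypothetical unit, e.g. workflow starts per second
    latency_ms: float  # Schedule-to-Start latency observed at that load


def load_test(
    load_levels: Iterable[int],
    measure: Callable[[int], float],
    adjust: Callable[[int, float], None],
    target_ms: float,
    max_adjustments: int = 5,
) -> List[Observation]:
    """Raise load step by step, adjusting configuration whenever the
    measured latency exceeds the target, then re-measuring."""
    history: List[Observation] = []
    for load in load_levels:
        latency = measure(load)          # check monitoring at the new load
        attempts = 0
        while latency > target_ms and attempts < max_adjustments:
            adjust(load, latency)        # change configuration (assumed hook)
            latency = measure(load)      # re-check after the change
            attempts += 1
        history.append(Observation(load, latency))
        if latency > target_ms:
            break                        # bottleneck not removed; stop here
    return history


# Simulated cluster (an assumption for the demo): latency stays low until
# load exceeds capacity, and each adjustment adds a fixed amount of capacity.
state = {"capacity": 100}


def fake_measure(load: int) -> float:
    return 50.0 if load <= state["capacity"] else 500.0


def fake_adjust(load: int, latency_ms: float) -> None:
    state["capacity"] += 100


history = load_test([100, 200, 300], fake_measure, fake_adjust, target_ms=150.0)
for obs in history:
    print(obs)
```

In a real test, `measure` would read a metric such as Schedule-to-Start Latency from the monitoring system, and `adjust` would be a manual or scripted configuration change. The `max_adjustments` cap is a design choice of this sketch so that the loop stops instead of retrying indefinitely when a bottleneck cannot be removed.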