This May, Baseten is focusing on AI events, multicluster model serving, tokenizer efficiency, and forward-deployed engineering. The company is hosting tech talks and workshops in New York, San Francisco, and online, covering topics such as AI phone calling, LLM optimization, and async model inference. Baseten's new multicluster architecture extends its model serving across multiple clusters, and comparing tokenizer efficiency across LLMs makes performance metrics more accurate, since throughput measured in tokens is only comparable between models once differences in how much text each token encodes are accounted for. The company is also hiring forward-deployed engineers for its growing engineering team, a role that combines engineering, sales, and customer support. Finally, Baseten is highlighting its focus on open source AI and inviting readers to apply for its upcoming positions.