17 blog posts published in 2020, summarized by month:

  • December 2020: 1 post published.

  • Dec. 22, 2020

    Welcome to Arize, Mikyo!

    Arize AI has welcomed Mikel King, also known as Mikyo, to its engineering team. Previously a founding member of Engine ML, where he worked on distributed training infrastructure for deep learning, and a senior engineer at Apple, Mikyo holds a degree in Computer Engineering and Computer Science from the University of Southern California. He is passionate about Arize AI's mission to ensure accountability in AI systems and looks forward to contributing to tools for ML engineers and data scientists. Outside of work, Mikyo enjoys climbing, mountain biking, woodworking, and cooking with friends.

  • November 2020: 1 post published.

  • Nov. 17, 2020

    Women In AI To Watch

    Despite the rapid growth of the machine learning market, women still make up only 12% of the ML workforce. Maintaining a balanced and diverse workforce is important to creating an equitable technological future. Role models are essential in creating a sense of solidarity among women in (and interested in) tech, helping to shift gendered perceptions of the industry and catalyze institutional change around hiring norms and a real commitment to all forms of diversity. By sharing stories from female thought leaders in the AI/ML space, we can inspire more young women to enter the workforce and pave their own way into STEM.

  • October 2020: 5 posts published.

  • Oct. 28, 2020

    Arize AI Selected For insideBIGDATA’s Impact 50 List

    Arize AI has been selected for insideBIGDATA's Impact 50 list in Q4 2020, a quarterly list of the most important movers and shakers in the big data industry. Arize AI is recognized as the leading ML Observability platform, designed to troubleshoot, monitor, and explain AI deployed in the real world. The company was founded by leaders in the Machine Learning space to bring better visibility and performance management over AI. With its innovative platform, Arize AI aims to create a more transparent and trustworthy future with AI, enabling Data Scientists and Machine Learning engineers to deploy models with confidence.

  • Oct. 21, 2020

    Arize AI Wins 2020 AI TechAward for Enterprise AI

    Arize AI has won the 2020 AI TechAward for Enterprise AI, recognizing its contribution to technical innovation in the AI, Machine Learning & Data Science industry. The company's product was selected based on notable attention and awareness in the industry, general regard and use by the developer and engineering community, and leadership in its sector for innovation. Arize AI provides an ML Observability platform that helps make machine learning models work effectively in production.

  • Oct. 19, 2020

    Using Statistical Distance Metrics for Machine Learning Observability

    Statistical distance metrics quantify the distance between two distributions, which is extremely useful in machine learning observability. Data problems can arise from sudden data pipeline failures or long-term drift in feature inputs, and statistical distance measures give teams an indication of changes in the data affecting a model and insights for troubleshooting. Real-world examples include data indexing mistakes, bad text handling, and software engineering changes that alter the meaning of a field. These issues can be caught using statistical distance checks on model inputs, outputs, and actuals, which allow teams to get in front of major model issues before they affect business outcomes. The reference distribution can be fixed or moving, depending on what the team is trying to catch, and different types of distance checks are valuable for catching different types of issues. The PSI metric is a great example of a statistical distance measure with real-world applications in the finance industry, particularly for detecting changes in feature distributions that might make them less valid as inputs to models.
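
    As an illustration, PSI can be computed by binning a reference sample and comparing bin frequencies against a current sample. The sketch below is a minimal version; the function name, bin count, and epsilon floor are illustrative choices, not from the article.

```python
import numpy as np

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a current sample."""
    # Bin edges come from the reference (expected) distribution
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor empty bins so the log term stays finite
    expected_pct = np.clip(expected_pct, eps, None)
    actual_pct = np.clip(actual_pct, eps, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))
```

    A common rule of thumb in credit risk is that PSI below 0.1 indicates no significant change, while PSI above 0.25 signals a major distribution shift worth investigating.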

  • Oct. 16, 2020

    Arize AI and Paperspace Announce a Partnership to Bring Deep ML Observability Solutions to Data Science Teams

    Paperspace customers can now easily integrate the Arize AI platform for model monitoring, troubleshooting, and explainability, allowing them to gain insights into their models' performance in production and troubleshoot issues quickly. The integration provides a simple pre-tested solution that is easy to set up, enabling teams to monitor data drift and model drift and troubleshoot problems in a purpose-built platform designed for ML Observability. Arize AI's platform helps teams transition from research environments to production while maintaining the results delivered in research, and builds trust between research teams and end users by providing explainable and transparent insights into their models' performance.

  • Oct. 14, 2020

    Welcome to Arize, Nate!

    Arize introduces Nate Mar as the newest member of its engineering team. Previously at PagerDuty, Nate worked on machine learning infrastructure and alert notification pipelines. He is passionate about ML fairness and transparency, aligning with Arize's mission to help engineers and data scientists better understand model performance in production. Nate holds a degree in Political Science from UC Berkeley but discovered his passion for software engineering after starting his career in tech. When not coding, he enjoys playing the cello, learning Chinese, and reading cookbooks.

  • September 2020: 1 post published.

  • Sept. 17, 2020

    ML Infrastructure Tools for Production: Part 2 — Model Deployment and Serving

    The article discusses the stages of Machine Learning (ML) infrastructure and their functions across the model building workflow, noting that ML infrastructure platforms are crucial for businesses looking to leverage AI effectively. The three main stages of the ML workflow are data preparation, model building, and production, and each stage has specific goals and challenges to weigh when choosing an ML infrastructure platform. The article then delves into Model Deployment and Serving, the final step in the ML process. It explains various serving options for models, such as internally built executables, cloud ML providers, batch or stream hosting solutions, and open-source platforms. Which serving option to choose depends on factors like data security requirements, managed versus unmanaged solutions, compatibility with other systems, and the need for GPU inference. The article also discusses deployment details, implementation methods, containerization, real-time versus batch models, and features to look for in model servers. It concludes by mentioning some ML infrastructure platforms for Deployment and Serving, such as DataRobot, H2O.ai, SageMaker, Azure, Google Kubeflow, and Tecton AI.
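
    For intuition, a real-time serving endpoint in its simplest form is just an HTTP handler wrapping a model's predict call. The sketch below uses only Python's standard library; the stand-in `predict` function and the request format are hypothetical, not from the article.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in model: score = 2*x + 1
def predict(x):
    return 2 * x + 1

class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Parse the JSON request body and run the model
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["x"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an OS-assigned port and serve in a background thread
server = HTTPServer(("127.0.0.1", 0), PredictionHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
```

    Production model servers layer much more on top of this core loop: batching, GPU scheduling, model versioning, and the monitoring hooks the article describes.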

  • August 2020: 3 posts published.

  • Aug. 31, 2020

    Arize AI Named TiE50 Award Winner at TiEcon

    TiE50 Awards recognize innovative startups like Arize AI, a pioneer in machine learning observability platforms, which helps businesses troubleshoot and explain their models' performance, making it crucial for successful model deployment. The awards program is ten years old and has attracted high-potential startups from different parts of the world, providing recognition and opportunities to pitch to investors and entrepreneurs. Arize AI's platform provides a real-time solution to monitor, explain, and troubleshoot issues as models move from research to production, showcasing its innovative approach in the ML observability space.

  • Aug. 8, 2020

    Welcome to Arize, Manisha Sharma!

    Manisha Sharma has joined Arize AI's Frontend Engineering team. Previously, she worked as a Frontend Engineer at Slack and on data visualization and design systems at Pandora Music. She holds a bachelor’s degree in Cognitive Science from UC Berkeley. Manisha is passionate about inclusiveness, accessibility, and transparency in the tech field. She has taught programming classes with organizations like Black Girls Code, Girls Who Code, and Code Nation. Manisha envisions a future with responsible and ethical AI and believes that it's necessary to build tools that provide insight and explainability to critical decisions made by AI.

  • Aug. 5, 2020

    ML Infrastructure Tools for Production (Part 1)

    The machine learning infrastructure space is complex and crowded, with various platforms offering different functions across the model building workflow. Understanding the goals and challenges of each stage of the workflow can help businesses make informed decisions on which ML infrastructure platforms to use. The production environment is a critical part of the model lifecycle, where the model touches the business and makes decisions that improve outcomes or cause issues for customers. However, transitioning from a research environment to a production engineering environment poses unique challenges, such as moving from rapid experimentation in Jupyter Notebooks to software engineering environments with version control, test coverage analysis, and reproducibility. Model validation is critical to delivering models that work in production, involving testing model assumptions, demonstrating how well a model will work under different environments, and ensuring the model's performance matches expectations. ML infrastructure tools can help with model validation by providing repeatable and reproducible tests, enabling organizations to reduce time to operationalize models and deliver models with confidence.
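
    To make the idea of repeatable, reproducible validation concrete, here is a minimal sketch of a slice-based acceptance check; the function names, slices, and threshold are illustrative, not from the article.

```python
def accuracy(model, examples):
    """Fraction of (input, label) pairs the model classifies correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)

def validate(model, slices, min_accuracy=0.8):
    """Run the same accuracy check on every data slice and report results.

    Returns (passed, report): passed is True only if every slice clears
    the threshold, so a regression on any environment blocks release.
    """
    report = {name: accuracy(model, data) for name, data in slices.items()}
    passed = all(score >= min_accuracy for score in report.values())
    return passed, report
```

    Running the same checks on every candidate model makes the pre-launch gate repeatable, which is exactly the property the article argues ML infrastructure tools should provide.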

  • May 2020: 3 posts published.

  • May 14, 2020

    AI in the Time of Corona

    The coronavirus pandemic has created an extreme environment that is challenging machine learning (ML) models trained on previously seen observations. Businesses with live production models are facing issues as these models make incorrect decisions based on data they have never encountered before. This article discusses the challenges faced by AI/ML models during such black swan events and provides best practices to build resilience in production AI/ML during outlier events and extreme environments. These include tracking and identifying outlier events, deciding on a model fallback plan, finding look-alike events, building a diverse portfolio of models, and understanding the uncertainty of model predictions when performance cannot be improved.
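
    As a sketch of the first practice, tracking outlier events can start as simply as flagging inputs that fall far outside the distribution seen in training. The z-score rule and threshold below are illustrative, not from the article.

```python
import statistics

def is_outlier(value, reference, z_threshold=3.0):
    """Flag a value whose z-score against the reference sample exceeds the threshold."""
    mean = statistics.fmean(reference)
    stdev = statistics.stdev(reference)
    return abs(value - mean) / stdev > z_threshold
```

    In an extreme environment like the pandemic, a check like this firing across many features at once is a signal to consider the fallback plan rather than trust the model's predictions.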

  • May 14, 2020

    ML Infrastructure Tools for Model Building

    The machine learning workflow is broadly divided into three stages: data preparation, model building, and production. Model Building involves understanding business needs, feature exploration and selection, model management, experiment tracking, model evaluation, and pre-launch validation. Various ML infrastructure companies offer platforms for different functions within the Model Building stage, including Alteryx/Feature Labs, Paxata (DataRobot), H2O, SageMaker, DataRobot, Google Cloud ML, Microsoft ML, Weights and Biases, Comet ML, MLflow, Domino, TensorBoard, Fiddler AI, Arize AI, and stealth startups. The challenges in Model Building include reproducibility of models, understanding model performance, and ensuring that performance in the experimental stage translates to real-world scenarios.
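
    At its core, the experiment tracking mentioned above means recording each run's parameters and metrics somewhere durable so results are reproducible and comparable. The append-only JSON-lines log below is a minimal sketch; the file layout and field names are illustrative, not any vendor's format.

```python
import json
import pathlib
import time

def log_run(params, metrics, run_dir="runs"):
    """Append one training run's configuration and results to a JSON-lines log."""
    path = pathlib.Path(run_dir)
    path.mkdir(parents=True, exist_ok=True)
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with open(path / "runs.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

    Platforms like MLflow and Weights and Biases build on the same idea, adding artifact storage, dashboards, and collaboration on top of the run log.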

  • May 14, 2020

    ML Infrastructure Tools for Data Preparation

    The text discusses the importance of Machine Learning (ML) infrastructure platforms for businesses across various industries. It breaks down the ML workflow into three stages: data preparation, model building, and production. Data preparation is a crucial stage where raw data is transformed into inputs for training models. This involves sourcing data from different stores, ensuring completeness, adding labels, and transforming data to generate features. Various tools and platforms assist in these tasks, such as Elasticsearch, Hive, Qubole, Scale AI, Figure Eight, Labelbox, Amazon SageMaker, Trifacta, Paxata, Alteryx, Spark, Databricks, Domino, Cloudera Workbench, and others. The text also highlights the challenges faced in data preparation, such as sourcing data from multiple locations, ensuring completeness, and maintaining clean data. It emphasizes the importance of tracking versioned data transformations and using feature stores to reduce duplicative work and compute costs.
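
    To illustrate the completeness checks mentioned above, a data preparation pipeline can compute per-column fill rates before features are generated. The helper and column names below are hypothetical.

```python
def completeness_report(rows, required_columns):
    """Fraction of rows with a non-null value for each required column."""
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is not None) / total
        for col in required_columns
    }
```

    Tracking these fill rates across versioned snapshots of the data makes sudden drops in completeness visible before a model is trained on incomplete inputs.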

  • March 2020: 2 posts published.

  • March 17, 2020

    The AI Ecosystem is a MESS

    The AI ecosystem is still in its early phase, with many companies struggling to integrate AI into their businesses due to lack of resources and expertise. The team behind this blog has experience working at top tech companies like Uber, Google, Facebook, and Adobe, and aims to cut through the hype surrounding AI by providing insights and guidance for making informed investment decisions. They focus on companies that are building software solutions to empower Data Science teams, rather than those selling software to vertical business solutions. The team identifies several key challenges in the AI space, including figuring out the right problems to solve with AI, dealing with complexity and software engineering issues, and evaluating ML/AI software companies. To address these challenges, they propose a simple mental model for sectioning products into pre-production or production, and categorizing them based on their stage in the ML workflow (Data Prep, Model Building, and Production). They also offer guidance on how to evaluate AI company pitches and provide resources for further learning.

  • March 10, 2020

    Welcome Tsion Behailu!

    Tsion Behailu is excited to join Arize's engineering team, bringing her experience working on Google Drive, Waze Ads, and Android Partner projects. She previously led a research project aimed at increasing access to markets and knowledge for Kenyan smallholder farmers using ICTs. Tsion believes in an ethical AI world where decisions are accountable, fair, and transparent.

  • February 2020: 1 post published.

  • Feb. 18, 2020

    Why We Exist

    Arize AI, founded by experienced professionals from Berkeley EECS, Uber's ML infrastructure team, Google and Facebook engineering, and the TubeMogul/Adobe real-time analytics and statistics team, aims to address the challenges of deploying AI models in the real world. The company focuses on Production ML, offering an AI Observability platform that helps companies identify issues with their models, understand where they are failing, and explain those failures. Arize AI's approach is based on the belief that there is a significant need for best-in-class, vertically focused solutions in the complex ML/AI industry. The company aims to help businesses improve model performance by providing valuable insights into their production models. With more companies deploying more models in critical business functions than ever before, Arize AI's mission is timely and essential for ensuring the successful deployment of AI across industries.