Company
Date Published
June 14, 2023
Author
Arun Nanda
Word count
2364
Language
English
Hacker News points
None

Summary

This article demonstrates how to perform basic statistical analysis using PostgreSQL's built-in functions. It covers the mean, variance, standard deviation, coefficient of variation, outliers, covariance, correlation, and regression. The advantages of running these analyses within the database include having fewer IT systems to manage and maintain, avoiding passing data back and forth between systems, and relying on a mature RDBMS to enforce data integrity and consistency. The example dataset used in the article is based on cancer statistics.