DataStax Python Driver: A Multiprocessing Example for Improved Bulk Data Throughput
This post explains how to improve the throughput of Python applications that work with large datasets, which often become CPU-bound because of serialization and deserialization. It suggests using the multiprocessing package from the Python standard library to distribute work across multiple processes, so that an application can use more than one CPU. The author walks through a detailed example of using multiprocessing with the DataStax Python Driver to achieve higher throughput, and notes the trade-offs of the pattern, such as overhead costs and latency sensitivity.
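The general shape of the pattern can be sketched with the standard library alone. This is a minimal illustration, not the code from the post: the function names (`encode_batch`, `chunked`, `encode_parallel`, `_init_worker`) are hypothetical, and JSON encoding stands in for the CPU-bound serialization work that a driver would do. In a real application, the per-process initializer is where each worker might open its own driver session; that step is left as a comment here because it needs a running cluster.

```python
import json
from multiprocessing import Pool


def _init_worker():
    # Hypothetical per-process setup hook. In a driver-based application,
    # each worker process would typically create its own connection here
    # rather than sharing one across processes. Left empty in this sketch.
    pass


def encode_batch(rows):
    # Stand-in for CPU-bound work (serialization) on one batch of rows.
    return [json.dumps(row, sort_keys=True) for row in rows]


def chunked(items, size):
    # Split a list into consecutive batches of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


def encode_parallel(rows, processes=2, batch_size=100):
    # Distribute the batches across a pool of worker processes and
    # flatten the per-batch results back into a single list, in order.
    batches = chunked(rows, batch_size)
    with Pool(processes=processes, initializer=_init_worker) as pool:
        encoded = pool.map(encode_batch, batches)
    return [item for batch in encoded for item in batch]


if __name__ == "__main__":
    rows = [{"id": i, "value": i * i} for i in range(1000)]
    encoded = encode_parallel(rows, processes=2, batch_size=100)
    print(len(encoded))
```

Batching matters in this sketch because every call into the pool pays for pickling arguments and results between processes. Sending one row at a time would likely spend more on that overhead than the parallelism gains, which is one form of the trade-off the summary mentions.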
Company
DataStax
Date published
June 23, 2015
Author(s)
Adam Holmberg
Word count
1573
Language
English
Hacker News points
None found.