Company
Date Published
Aug. 27, 2020
Author
Brian Mortimore
Word count
6572
Language
English
Hacker News points
None

Summary

In this webinar, Brian Mortimore of DataStax discussed migrating enterprise applications from relational databases to Apache Cassandra using CDC (Change Data Capture) and ETL (Extract, Transform, Load). The first step is to identify a business function and define its API object. The Cassandra data model is then built from that definition, and DDL (Data Definition Language) is provided for implementation. A CDC pipeline keeps the Cassandra replica updated in real time, while a query-based pipeline handles the migration by transforming legacy data. The methodology involves accelerating the CDC pipeline, connection, and query-based pipeline, as well as enabling the operational data layer and delivering a Spring Framework-based microservice architecture for CRUD operations. This approach lets customers take incremental bites of their transformation journey, starting with the most painful process and moving on to bigger transformations over time.

Key points from the Q&A session:

1. Handling multiple applications that use the same reference data during migration - Existing systems can be moved to the new microservice architecture over time, which allows a live migration without cutting off access to the source data.
2. Creating Cassandra data models before starting CDC - The Cassandra data model must exist before CDC starts, because it is the destination for the captured data.
3. Impact of node repair on high-throughput clusters with sub-10 ms response times - DataStax has developed NodeSync, which significantly improves repair efficiency and performance in newer DataStax Enterprise versions.
4. Best modeling tool for creating NoSQL models - Brian Mortimore prefers plain text tools such as Notepad and focuses on a query-based approach to data modeling.
5. Consistency issues with multiple Spring-based API microservices connecting to the same database - Cassandra is an eventually consistent system, so applications that interact with it must be designed around proper consistency rules.
6. Migrating from MongoDB to Cassandra using CDC - Data can be migrated from MongoDB to Cassandra using CDC.
7. Managing and planning for growth and cost in a NoSQL environment - Focus on identifying and eliminating extraneous data, ensuring that the data model can scale appropriately, and aligning growth with business objectives to justify costs.
8. Resolving data conflicts in CDC and query-based pipelines - Avoid pipeline collisions by designing the data model so that multiple pipelines do not update the same data objects. Use lightweight transactions or error checking if necessary.
9. Sample applications for learning DDD (Domain-Driven Design) and other concepts - DataStax offers introductory classes and training materials through DataStax Academy, which can help newcomers to the NoSQL world grasp these concepts.
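
As an illustration of the "error checking" mentioned in point 8, the following is a minimal sketch, not the approach shown in the webinar. It assumes each change event carries a per-key version number from the source system, and it uses an in-memory class as a stand-in for the destination table. The names `ChangeEvent`, `Replica`, and `apply` are hypothetical. The guard skips stale or replayed events, so a query-based bulk load and a CDC stream that both touch the same object cannot overwrite newer data with older data. In Cassandra itself, a comparable guard could be expressed as a conditional write (a lightweight transaction), at a latency cost.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change from the legacy database (hypothetical shape)."""
    key: str                        # business key of the API object
    version: int                    # assumed to increase per key in the source
    op: str                         # "upsert" or "delete"
    payload: Optional[dict] = None  # column values for an upsert


class Replica:
    """In-memory stand-in for the destination table; not a Cassandra client."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}
        # Versions are kept even after a delete so that a stale upsert
        # cannot resurrect a removed row.
        self.versions: Dict[str, int] = {}

    def apply(self, event: ChangeEvent) -> bool:
        """Apply the event only if it is newer than what the replica holds.

        Returns True when applied, False when skipped as stale or duplicate.
        """
        if event.version <= self.versions.get(event.key, -1):
            return False
        if event.op == "delete":
            self.rows.pop(event.key, None)
        elif event.op == "upsert":
            self.rows[event.key] = dict(event.payload or {})
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        self.versions[event.key] = event.version
        return True


if __name__ == "__main__":
    replica = Replica()
    # The query-based pipeline loads the migrated row first.
    replica.apply(ChangeEvent("cust-1", 1, "upsert", {"name": "Ada"}))
    # The CDC pipeline delivers a newer change, then a replay of the old one.
    replica.apply(ChangeEvent("cust-1", 2, "upsert", {"name": "Ada L."}))
    replica.apply(ChangeEvent("cust-1", 1, "upsert", {"name": "Ada"}))
    print(replica.rows)
```

The version comparison makes `apply` idempotent, so events can be redelivered safely. This only helps when a trustworthy ordering value exists in the source; where it does not, designing the data model so that pipelines never write the same objects, as the summary recommends, remains the simpler option.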